Protecting Your Data in the Mess of Old Web Accounts

I have over 120 online accounts, which is insane. I’m not even the sort of person who signs up for things on a regular basis; I have a natural aversion to opening these accounts. But all it takes is switching insurance every few years, applying to a handful of jobs here and there, or signing up for basic things you need or want every now and then, and 10 years later you are left with a mess. I want to talk about the nature of this mess and share some ideas on how to simplify it.

First of all, I think it pays to avoid creating new accounts if you can. Some services try to strong-arm you into creating an account, such as Microsoft when you first set up a Windows 10 installation, but if you take things slowly, you might find a “skip” button that allows you to avoid this. Some sites will offer you discounts if you provide your email, but this is tricky: they probably wouldn’t offer this if the benefit to you were greater than the benefit to them. Your data has a price: make sure that you are prepared for the true cost if you wish to trade it.

The greatest danger of having a plethora of online accounts comes from password reuse: if you are reusing an email/password or username/password combination on multiple sites, a hack of one site might leave your data vulnerable on other sites. This is why experts recommend never reusing passwords. Another danger is that once a password is exposed anywhere, it can be added to a “dictionary” of known passwords and tested against other accounts in what is known as a “dictionary attack”. Basically, your password is typically stored as a hash on a website’s server. An attacker who gets a copy of that database can take a huge list of known passwords, run the hash algorithm on each of them, and compare each result to the hashes stored in the database. If there is a match on any row, the password for that row is effectively known. It requires a considerable degree of access to do this, but it can and does happen. If you’ve been using the same password on 80 sites, you could be in serious trouble depending on what data you put on those sites.
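To make the dictionary attack described above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the email addresses, the word list, and the use of plain unsalted SHA-256, which is the weakest case. Sites that salt their hashes and use a deliberately slow algorithm make this kind of lookup much harder, though a reused or common password is still the first thing an attacker will try.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Hash a password with unsalted SHA-256 (illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: email -> stored password hash.
leaked_rows = {
    "alice@example.com": sha256_hex("sunshine1"),
    "bob@example.com": sha256_hex("Xq7#vT!m92Lz"),
}

# Hypothetical "dictionary" of passwords collected from earlier breaches.
dictionary = ["password", "123456", "sunshine1", "letmein"]


def dictionary_attack(rows, candidates):
    """Return {email: password} for every row whose hash matches a candidate."""
    # Hash every candidate once, then look each stored hash up in that table.
    lookup = {sha256_hex(word): word for word in candidates}
    return {email: lookup[h] for email, h in rows.items() if h in lookup}


cracked = dictionary_attack(leaked_rows, dictionary)
print(cracked)  # {'alice@example.com': 'sunshine1'}
```

The account with the common password falls immediately, while the one with a random, unique password does not appear in the result because nothing in the word list hashes to it.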

And that leads to an important question: what data is actually at risk? It matters, because sometimes not much is. After performing some sneaky searches on myself, I discovered, to my chagrin, that my address and phone number already exist on the internet, and while that frustrates me, it’s reality for most people, too. It is becoming easier and easier for bad actors to sell data. So in a sense, you do need to ask yourself if it’s worth your time covering all this stuff up. But I think it is, because data corrodes very easily, and the information about you that exists on the wider web isn’t perfectly accurate, and you should plan to keep it that way. Limiting your attack surface is critical, and it’s better to have a general system in place to prevent your data from creeping onto the internet. Of course, if the only thing that gets leaked is a list of books you recommend, maybe that’s not such a big deal. Or maybe it is.

General Strategies

Follow at your own risk!

  1. Obscure Old Data
    • Names
      • Usually, your email address will expose your real name, but that doesn’t mean you need to make it easy if a web service is hacked. When an application “requires” you to have a First Name and Last Name entered, it is often just checking that you have at least one character for each. If you would like to obscure your name, you can often just as easily enter “R” instead of “Risky” and “F” instead of “Flannel”, or go one step further and use unrelated letters such as “P” and “Q”. Ditto for addresses.
      • Most databases are periodically backed up and stored, and sometimes versions of data are recorded, but if you no longer need a service and it doesn’t provide you a way to delete your account, this is not a bad way to obscure your information.
      • Definitely don’t do this if you are legally required to provide accurate information, such as for a bank account. This is only appropriate for something like sports tickets you bought through a website half a decade ago and don’t expect to buy again any time soon. The intention is not to lie; the intention is to decouple your data from the account in case that site gets hacked. It’s not your fault if websites don’t provide you with a way to delete your information, and most of them don’t.
    • Social Security Numbers
      • Let’s be honest, many of these have already been exposed over the years, but again, you don’t want to make it easy for anybody to get hold of this. Sometimes the system only wants nine digits; it doesn’t necessarily check that they are correct. I find this important to obscure in old applications because you really don’t want this number getting out if you can prevent it. But again, only do this if an accurate number is not required by law and you don’t intend to use the service again. Several job application sites require this, but if you don’t plan to use that site again, it can effectively be obscured.
    • Passwords
      • I’m guilty of reusing passwords. I’m also guilty of reusing pieces of passwords. I’ve been reluctant to switch to a password manager because it creates a single point of failure, but with over 120 online accounts, most of which are never used, it occurs to me that there is no good reason to keep familiar passwords on them. I’m in the process of using a password generator to create crazy random, unique passwords for all the old sites I never use but can’t delete, because believe it or not, these sites can expose my favorite passwords to risk. If ever those passwords are figured out from one or more hacks, it’s game over, and I can never reuse those passwords again. So if you don’t need to remember the login for a site, why not make it super unique? It might still be useful to record these passwords somewhere, which I do.
      • Remember, your password is data, too. You protect this data by not reusing passwords, and by not letting them reveal anything about yourself.
    • Emails
      • Emails are more difficult to change. Often, you cannot update your email address for an application without being required to confirm the new one. You often cannot recover an account without a working email, either. I had an old Yahoo account many, many years ago that I manually shut down. Some old accounts used that email, but when I go to recover the password, the email is sent to nowhere. It is virtually impossible to recover those accounts. In a sense, that’s not a bad thing, but in another sense, it means I may no longer be able to get into those accounts to obscure my information.
      • I’m contemplating the strategy of using a dummy account for all the old sites I wish to forget about. If they have no important information about me, using an email address that includes a fake name inside of it might be a great way to completely disassociate the account from me (excluding IP logs and backup data, of course). If the data shows a first name of “A”, a last name of “A”, and an email address of johndoe9999[at]gmail[dot]com with a crazy random password, a breach of that data effectively tells hackers absolutely nothing about me. Again, just don’t forget that not all of your information is visible. I’m pretty certain government authorities can still get this information based on deeper logs, so don’t you be doing anything illegal! I have mixed feelings about this strategy in general, though, as you will still typically be reusing your dummy account for other sites. And if you want to go full tin-hat, finding a pattern to your actions across various accounts might at least be a flag that you are a decently determined cookie when it comes to covering your tracks.
      • There are also sites that allow you to set up temporary, technically public emails that are quickly disposed of. If you are certain an account has no important information on you and you want to be able to confirm an otherwise bogus email, this might actually do the trick, and maybe then you can effectively forget all about that account. Personally, I’d want to investigate how these sites work before I use them for this. They are, however, often used by developers for testing their applications with throw-away emails. It has occurred to me that these could be useful for this purpose.
    • Resumes
      • Most job sites require you to upload a resume. It’s a pain in the butt, but it’s also dangerous because your resume says an awful lot about you. When faced with this, my solution has been to create a blank Word document titled “Resume” and upload this in place of any existing resume. Usually, the system is just checking that a file is indeed attached. Who says it can’t be blank? The downside is that this only gets you so far. Job application accounts are easy to set up, but often extremely difficult to delete. It’s extremely likely a copy of your resume is still stored with the record of the job you applied for. Even then, I’d rather the resume “on file” be bogus if I don’t intend to use that site ever again.
  2. Delete Old Data
    • Obscuring data is fine and all, but what about deleting it entirely? Well, that sounds nice at first, but it’s not always as comprehensive as you might think. Perhaps this has changed, but it used to be that if you permanently deleted your Facebook account, pictures in which you tagged others wouldn’t be deleted, but would be orphaned off with the people you tagged, and thus would technically remain in the system. Ditto comments, etc. Sometimes, deleting an account does nothing but set a little flag in the database that prevents you from logging in, without otherwise changing anything about your information. Now, sometimes it does in fact delete everything, but I’ve never seen a website that spelled out its methodology clearly, and it’s my instinct not to trust companies to do what they say: I trust my own understanding of databases more. If you do have the option to delete an account, I think it makes sense to first obscure as much as you possibly can and THEN delete the account. This doesn’t mean information won’t stay in the system, but it’s a start.
    • Also be aware, sometimes deleting an account means you can never create a new account with the same email address again. This is why it might be smarter to switch it to a dummy email address first, and as long as the application doesn’t track email changes and prevent every iteration from creating an account, obscuring the email first should effectively allow you to create a new account with your regular email address in the future. In theory.
  3. Avoid Providing Data in the First Place
    • As mentioned before, this is king. You don’t have to worry about all this crap if you don’t have a ton of accounts in the first place. Not everything can be avoided, but you might be surprised how much can. No database backups, no IP tracking, no versioning, no account to expose in the first place. Aim for this.
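
For the password strategy under “Obscure Old Data”, a password generator does not have to be anything fancy. Below is a minimal sketch using Python’s standard `secrets` module, which is intended for security-sensitive randomness. The site names, the 24-character length, and the character set are assumptions for illustration; some sites reject certain punctuation, so the alphabet may need trimming.

```python
import secrets
import string

# Assumed character set: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Build a random password by drawing each character from ALPHABET."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Hypothetical old sites that can't be deleted; each gets its own password.
old_sites = ["old-tickets.example", "job-board.example"]
new_passwords = {site: generate_password() for site in old_sites}

for site, password in new_passwords.items():
    print(site, password)
```

Because each site gets an independently generated password, a breach of one of them reveals nothing reusable about the others. The output still needs to be recorded somewhere safe if there is any chance of logging in again.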