Data / Tech Consultant
Email Organizer
Python, Excel, Outlook
Introduction
For anyone who's had to create Outlook meetings with massive invitee lists you might have encountered this problem: how do you make sure everyone that needs to be in the meeting is actually in the invitation? The only way that I know of (for a non set invitee list) is to look through the "To:" and "CC" sections in the top of the invite and manually check each person in the list. To make this a little easier (and add the ability to compare in Excel to a set list of "core" invitees) I decided to make a tool that separates out the name from the email and gives both back to you as a CSV.
​
When I initially started this mini-project I assumed that this would easily be able to be accomplished in Excel by finding the first space from the end and dividing by that. To my dismay Excel does not have an easy way to iterate backwards through a cell and find a certain character.
​
So after using the convert text to columns tool in Excel to change the list of 100+ emails into 100 columns of emails I cracked open Jupyter Notebook, imported the excel file, and got to work.
Data Manipulation
Disclaimer: Many pictures included here will have a black box to cover sensitive user information to maintain privacy
First I imported the data from Excel as a Pandas DataFrame (the only package necessary to run this code is Pandas).
​
Next I added a dummy entry at the end of my DataFrame (I actually added this after writing all of my code as I encountered a problem that the last entry wasn't being added and this was the easiest way to fix it).
​
Then came the hard part. Thinking of a way to loop through each index in the DataFrame and then within that loop iterate backwards until the first space is found, where I'll then split the string into two strings, the name and the email. Let me explain my logic:
​
When you copy a list of emails in Outlook (say from a previously sent email) the copied emails will usually appear as, "Last Name, First Name, Middle Initial, Email" in a big chunk of text. If I were to iterate forwards in the string, I'll likely have to get to the third space (second if there's no middle initial) before I can separate between the name and email. This is why I iterate backwards, the first space should always be the separation between the email and the name. This also worked well when a colleague sent me a list of names with some of the names listing their company after the name (for contractors), which added one/two more spaces. When iterating backwards the first space was still the space between the email and name (I included the company in the name).
You can see how I did this from the comments in the picture above.
​
After these two lists, names and emails were created and filled I then had to flip each of the items in these lists back around. Because at this point they were separated, but backwards, and therefore looked like complete gibberish.
Now that the items in these lists are forwards, we can combine them into one nice Pandas DataFrame that's easily exportable to Excel. As shown below in an easy to read format:
Reflections
Overall this was a fun little project that helped to alleviate a problem that I often encounter in my project manager(ish) role at my company. Would it have also been fine to just read all of the names in paragraph format, instead of converting them into an easy to read CSV?
Yes, but it would just be slightly more infuriating. It's the little things that keep us going.