Sunday, April 29, 2018

Python Experience Summary

Teaching myself a new language has been a wonderful learning experience, and I am grateful that I have been compelled towards this endeavor during my final semester at John Carroll. I do believe that keeping up to date and continuing learning is critical especially in the fields of software development and computer science. Writing this blog has been a great reminder of that and an inspiration that independent learning outside of a structured classroom environment is very powerful.

This project has certainly had its ups and downs; some days, Python coding and syntax felt like a breeze, but other days not so much. There have been times in this project where I would have wanted to go to the professor to get a quick answer to my questions, but struggling through my problems on my own provided a greater reward in the end. Sometimes I'd be on the wrong track, but I would still be learning through that, and that is why this exercise has been so valuable. There is a crazy amount of resources out there for Python, and there is so much to learn. And then, even if you learn something one way, there may be ten other ways to do the same thing!

Learning Python has provided me with a new strong point for my resume, as well as a skill that is applicable to my future career. Data Scientists at Progressive Insurance, my future employer, use Python as the language for just about all of their programs. Having this skill will be invaluable as I work my way through the company. I hope that in the future I will continue to learn more in this language and have the opportunity to apply my knowledge. Even if I don't, this experience has taught me how to learn independently and continue my education beyond the bounds of a classroom, and I am all the more excited about what opportunities await me in the future!

Sunday, April 22, 2018

Sending Emails with Python

After a couple of hours of struggling to find a template that worked to send a simple email, after five attempts at the code, and after much browsing through Python tutorials that claimed to have the answer, I finally found code that worked! Here is the basic structure of Python code that works (at least for me) to send an email:



The code is pretty simple. I start by importing the Python library smtplib, which allows us to send emails. Then I define constants such as the email credentials and the message I wish to send, which I will use later. Next, I connect to the server, inputting the proper address and port for the Gmail server from which I will be sending emails. Note that I created a new Gmail account for the purpose of this project and I hid the password in the snip. The line that reads "server.starttls()" is a security function: it switches the connection to an encrypted one, which protects my password when the code is running. The last three lines of code input the login credentials to the server, send the email, and quit the server.
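For readers who want something to type in, here is a minimal sketch of that structure. The addresses and password are placeholders, the function name send_basic_email is made up for this example, and the host and port (smtp.gmail.com, 587) are the usual Gmail settings for a STARTTLS connection, so treat them as assumptions rather than the exact values in my snip:

```python
import smtplib

# Placeholder values for illustration only (not the real account).
SMTP_HOST = "smtp.gmail.com"   # assumed Gmail SMTP host
SMTP_PORT = 587                # assumed port for STARTTLS
SENDER = "sender@example.com"
PASSWORD = "password-goes-here"
RECIPIENT = "recipient@example.com"
MESSAGE = "Hello from Python!"


def send_basic_email(sender, password, recipient, message,
                     host=SMTP_HOST, port=SMTP_PORT):
    """Connect, switch to TLS, log in, and send one plain-text message."""
    # The with-statement quits the server automatically when the block ends.
    with smtplib.SMTP(host, port) as server:
        server.starttls()                 # encrypt the connection before logging in
        server.login(sender, password)    # send the credentials
        server.sendmail(sender, recipient, message)
```

Calling send_basic_email(SENDER, PASSWORD, RECIPIENT, MESSAGE) with real credentials would attempt the send. Nothing is sent just by defining the function.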

Now, to put this together with the rest of our code.

First, I called the sendEmail method from the main method, inputting all the claim information for the claim whose email is to be sent:



Here is the method to send the email:


And the formatted email that the program successfully sends:



I had a lot of trouble getting a basic email to send. Like I said, I went through at least 5 different sources (and several hours) to find one that had code that worked for me. Even then, I had to make some adjustments for the program to fit my needs. A lot of the confusion for me came from the following bit of code:



What this code does is assign the From and To addresses and the Subject to the appropriate fields for the recipient of the email to see. Doing this allows the email to display its details (sender, recipient, subject) when the recipient expands the header information:
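Here is a small sketch of setting those three fields. It uses EmailMessage from the standard library's email package, which may not be the same class my snip uses, and the function name build_message and all the addresses are made up for the example:

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build an email whose From, To, and Subject headers are filled in."""
    msg = EmailMessage()
    msg["From"] = sender        # shown to the recipient as the sender
    msg["To"] = recipient       # shown as the recipient
    msg["Subject"] = subject    # shown as the subject line
    msg.set_content(body)       # the plain-text body of the email
    return msg


# Placeholder values for illustration only.
msg = build_message(
    "sender@example.com",
    "recipient@example.com",
    "Your claim information",
    "Hello,\nThis is a test message.",
)
print(msg.as_string())
```

A message built this way can be handed to an smtplib connection with server.send_message(msg) instead of passing a plain string to sendmail.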




To test that the program can send more than one email, I created a new claim in the csv file with valid information to send to another one of my email accounts. The email did successfully send!

If you have any questions, please do not hesitate to ask in the comments!


Sources:
https://stackoverflow.com/questions/399129/failing-to-send-email-with-the-python-example
http://naelshiab.com/tutorial-send-email-python/

Saturday, April 14, 2018

Data Verification

Now that we have uploaded our file, we need to verify that the data we have is valid. To check that the information for each claim is complete, we verify the following:
  • That f_name, l_name, and email are not null
  • That claim_num is a 6 digit number
  • That phone_num is in the format ###-###-####
For each claim that contains errors, we will print out an error indicating what information is missing or invalid. If the data is all valid for a claim, we will state that the data is valid.
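The required-field and claim number checks above could be sketched roughly like this. It assumes each claim is a dictionary keyed by the column names, and that a valid claim number has six digits, the standard set in the project overview. The names find_errors and report and the sample values are made up for the example, and the phone check is left out because it needs a bit more code:

```python
import re

REQUIRED_FIELDS = ("f_name", "l_name", "email")
CLAIM_NUM_PATTERN = re.compile(r"[0-9]{6}")  # exactly six digits (assumed standard)


def find_errors(claim):
    """Return a list of messages describing what is wrong with one claim."""
    errors = []
    for field in REQUIRED_FIELDS:
        # A missing value read from a csv file shows up as an empty string.
        if (claim.get(field) or "").strip() == "":
            errors.append(field + " is missing")
    if not CLAIM_NUM_PATTERN.fullmatch(claim.get("claim_num") or ""):
        errors.append("claim_num is not a 6 digit number")
    return errors


def report(claim):
    """Print the problems for one claim and return True if it is clean."""
    errors = find_errors(claim)
    if not errors:
        print("Claim", claim.get("claim_num"), "has valid data")
    for error in errors:
        print("Claim", claim.get("claim_num"), "error:", error)
    return not errors


# Made-up sample rows for illustration only.
good_claim = {"claim_num": "131313", "f_name": "Jane", "l_name": "Doe",
              "email": "jane@example.com"}
bad_claim = {"claim_num": "1234", "f_name": "John", "l_name": "",
             "email": "john@example.com"}
report(good_claim)
report(bad_claim)
```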

Here is the data we are reading from:



Here's the code:


And here is the output:


Notes on the code:
Before going on to read the rest of this post, I encourage you to read through the code provided above. Once you have done that, here are some observations:

  • Notice that this code is written using methods. We have a main() method where the program starts, and one additional method called check_phone_validity(). I wrote this method because checking for a valid phone number requires more code than a simple conditional statement. I wrote all the other checks in the main() method because they were simple and short enough to handle there. I could have written methods for each check case, but it was not necessary and is really a matter of preference. Note that this code could also have been written without using any methods at all. This is a great example of Python's flexibility and ease of use!
  • Since f_name, l_name, and email are read in as strings, a missing value shows up as an empty string, so to check if they are null, we compare them to "". None, on the other hand, is what we would compare against for a variable that holds no value at all. This is a distinction I came across and learned about as I was writing the code.
  • We use the boolean variable 'clean' to mark whether the row of data being checked has any deficiencies. If it does, clean is set to False; otherwise, the claim is reported to have valid data. In my next blog post, I will add a line at the very end of the method, under the last conditional, so that if clean is True, the program calls a method to send a formatted email to the claimant with the information for that claim.
  • The check_phone_validity() method first checks that the input has 12 characters. Then it makes sure that there is a dash at positions 3 and 7 in the string (counting from zero), in keeping with the format requirements. The method returns whether the phone number is valid, which main() uses to decide whether the row is still clean.
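A rough sketch of a phone check along those lines is below. The length and dash checks follow the description above. The final digit check is an extra assumption added for the example, so that something like "abc-def-ghij" does not slip through:

```python
def check_phone_validity(phone_num):
    """Return True if phone_num looks like ###-###-####, otherwise False."""
    # The format ###-###-#### is always 12 characters long.
    if len(phone_num) != 12:
        return False
    # Dashes must sit at indexes 3 and 7 (counting from zero).
    if phone_num[3] != "-" or phone_num[7] != "-":
        return False
    # Extra check (an assumption for this sketch): everything else is a digit.
    digits = phone_num[:3] + phone_num[4:7] + phone_num[8:]
    return all(ch in "0123456789" for ch in digits)


# Made-up numbers for illustration only.
for number in ("216-555-0100", "2165550100", "call me later"):
    print(number, check_phone_validity(number))
```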
As always, feel free to comment or ask questions in the comments section below!

Sunday, April 8, 2018

File Import

As mentioned in the previous post, this post will be about how to import a csv file into the Python environment.

I started by creating an Excel file called ClaimsToday.xls containing fictitious information for 6 different claims. In order to transition smoothly into the checks for validity that will be addressed in the next post, I configured the data below such that out of the six claims, only one claim has all valid data fields:



Note the following:

  • Claim 123456 has the email field missing
  • Claim 121212 has the l_name field missing
  • Claim 123123 has f_name and email missing
  • Claim 131313 is the one valid claim with all information correct
  • Claim 1234 has an invalid 4 digit claim number
  • Claim 111007 has a string in the phone field, where there needs to be a number
Given this data, moving forward, the program will need to flag all claims except claim 131313, whose information the program will email. The program will print a message describing the deficiencies in the other claims and will not attempt to email their information to the claimant.

Once I created the Excel file, I saved it as ClaimsToday.csv.

Below is the code to import the csv file. We import and access the data using a reader. (If you were in Advance Webdesign last semester, we learned about this in that class, but using C#!) This code recognizes the first row of our data as fieldnames by default. After reading the file, I have a print statement to verify that my code worked the way I wanted it to.
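As a minimal sketch of this kind of import, the snippet below reads a csv file with csv.DictReader, which treats the first row as the field names. So that it runs on its own, it first writes a tiny sample file to a temporary folder. The function name read_claims, the column order, and the sample rows are all made up for the example and are not the actual contents of my ClaimsToday.csv:

```python
import csv
import os
import tempfile


def read_claims(path):
    """Read a csv file and return one dictionary per row of data."""
    with open(path, newline="") as csv_file:
        reader = csv.DictReader(csv_file)  # first row becomes the field names
        return list(reader)


# Made-up sample data for illustration only.
sample_csv = (
    "claim_num,f_name,l_name,email,phone_num\n"
    "131313,Jane,Doe,jane@example.com,216-555-0100\n"
    "1234,John,Smith,john@example.com,216-555-0101\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ClaimsToday.csv")
    with open(path, "w", newline="") as sample_file:
        sample_file.write(sample_csv)
    claims = read_claims(path)

# Print each row to verify that the data was read the way we wanted.
for claim in claims:
    print(claim["claim_num"], claim["f_name"], claim["l_name"], claim["email"])
```

Every value comes back as a string, including claim_num, which matters later when checking the data.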



Here's the result of the print statement:


Now that we have the data we need, in my next post I will show you how we can check the data before sending out the email.


Sources:
https://docs.python.org/3/library/csv.html#csv.DictReader

Tuesday, April 3, 2018

Project Overview

Upon my graduation this May, I will be working at Progressive Insurance as a data analyst. From what I know, I will be working with insurance claims data in some capacity.

Throughout the next three posts, I will be building a small Python program that simulates a large scale program a major insurance company could run daily to send out emails to claimants with the information for the claim they filed that day. The program will do the following:

  1. The program will read a csv file ClaimsToday with claims information. Every row will hold information for a different claim. In reality, for a big insurance company, there would be thousands of rows, but my file will contain a small sample of claims. The various columns will hold the different pieces of information for each claim, including claim number, claimant first name, claimant last name, address, phone number, email, description, damage, location of accident, date of accident, time of accident, police involvement etc.
  2. Then the program will check the data for completeness and validity. Once the data is read into the program, we want to make sure that it is clean. Certain required fields must be filled, such as claim number, first and last name, and email. Other fields will have to be checked for formatting. For instance, if the phone number is not a number or if the claim number is not in the right format (we'll set the standard for a claim number to be six digits for the purpose of this exercise), the program should produce an error message to report the deficiencies in the data. Note that large corporations have entire departments dedicated to data cleansing, so we will be doing a very small example of this sort of work here.
  3. Finally, if the data is clean, the program will format a message with the information for each claim, which will be automatically sent to the email associated with the claim. I will be sending all emails to my personal email account for the purposes of testing the program.
My next blog post will cover the first step of this process, reading the data, so stay tuned!

Fun fact, I was Flo from Progressive this past Halloween! Until next time!