Sunday, April 29, 2018

Python Experience Summary

Teaching myself a new language has been a wonderful learning experience, and I am grateful that I was compelled towards this endeavor during my final semester at John Carroll. I do believe that keeping up to date and continuing to learn is critical, especially in the fields of software development and computer science. Writing this blog has been a great reminder of that and an inspiration that independent learning outside of a structured classroom environment is very powerful.

This project has certainly had its ups and downs; some days, Python coding and syntax felt like a breeze, but other days not so much. There have been times in this project when I wanted to go to the professor to get a quick answer to my questions, but struggling through my problems on my own provided a greater reward in the end. Sometimes I'd be on the wrong track, but I would still be learning through that, and that is why this exercise has been so valuable. There is a crazy amount of resources out there for Python, and there is so much to learn. And then, even if you learn something one way, there may be ten other ways to do the same thing!

Learning Python has provided me with a new strong point for my resume, as well as a skill that is applicable to my future career. Data Scientists at Progressive Insurance, my future employer, use Python as the language for just about all of their programs. My having this skill will be an invaluable tool as I work my way through the company. I hope that in the future I will continue to learn more in this language and have the opportunity to apply my knowledge. Even if I don't, this experience has taught me how to learn independently and continue my education beyond the bounds of a classroom, and I am all the more excited about what opportunities await for me in the future!

Sunday, April 22, 2018

Sending Emails with Python

After a couple of hours of struggling to find a template that worked to send a simple email, 5 attempted versions of the code, and much browsing through Python tutorials that claimed to have the answer, I finally found code that worked! Here is the basic structure of Python code that works (at least for me) to send an email:



The code is pretty simple. I start by importing the Python library smtplib, which allows us to send emails. Then I define constants such as the email credentials and the message I wish to send, which I will use later. Next, I connect to the server, inputting the proper address and port for the Gmail server from which I will be sending emails. Note that I created a new Gmail account for the purpose of this project and I hid the password in the snip. The line that reads "server.starttls()" is a security step: it upgrades the connection to an encrypted (TLS) one, so my password is not sent in plain text when the code runs. The last three lines of code pass the login credentials to the server, send the email, and quit the server.
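For readers who cannot see the snip, here is a minimal sketch of the steps just described. It is not the exact code from the screenshot: the function name send_email, the host "smtp.gmail.com", and port 587 are my assumptions (587 is the usual port for STARTTLS), and the addresses are placeholders.

```python
import smtplib

# Assumed values: the post only says "the gmail server", so the host and
# port below are the commonly used ones, not copied from the original snip.
SMTP_HOST = "smtp.gmail.com"
SMTP_PORT = 587


def send_email(sender, password, recipient, message):
    """Connect, secure the connection, log in, send, and quit."""
    server = smtplib.SMTP(SMTP_HOST, SMTP_PORT)
    server.starttls()                 # switch to an encrypted connection
    server.login(sender, password)    # credentials now travel encrypted
    server.sendmail(sender, recipient, message)
    server.quit()
```

Calling send_email("me@example.com", "my-password", "you@example.com", "Hello!") would attempt a real delivery, so the sketch only defines the function and does not call it.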

Now, to put this together with the rest of our code.

First, I called the sendEmail method from the main method, inputting all the claim information for the claim whose email is to be sent:



Here is the method to send the email:


And the formatted email that the program successfully sends:



I had a lot of trouble getting a basic email to send. Like I said, I went through at least 5 different sources (and several hours) to find one that had code that worked for me. Even then, I had to make some adjustments for the program to fit my needs. A lot of the confusion for me came from the following bit of code:



What this code does is assign the From and To addresses and the Subject to the appropriate fields for the recipient of the email to see. Doing this allows the expanded view of the email to show its details (sender, recipient, subject):
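Here is a small sketch of setting those three header fields. It uses EmailMessage from the standard library, which may differ from the class used in my original snip; the helper name build_message and all addresses are made up for illustration.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return a message whose From, To, and Subject headers are filled in."""
    msg = EmailMessage()
    msg["From"] = sender        # shown as the sender in the mail client
    msg["To"] = recipient       # shown as the recipient
    msg["Subject"] = subject    # shown as the subject line
    msg.set_content(body)       # plain-text body of the email
    return msg


msg = build_message(
    "claims@example.com",
    "claimant@example.com",
    "Your claim 131313",
    "We received your claim.",
)
print(msg.as_string())
```

The string returned by msg.as_string() is what would be handed to the server when sending.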




To test that the program can send more than one email, I created a new claim in the csv file with valid information and the address of another one of my email accounts. The email did successfully send!

If you have any questions, please do not hesitate to ask in the comments!


Sources:
https://stackoverflow.com/questions/399129/failing-to-send-email-with-the-python-example
http://naelshiab.com/tutorial-send-email-python/

Saturday, April 14, 2018

Data Verification

Now that we have uploaded our file, we need to verify that the data we have is valid. To check that the information for each claim is complete, we verify the following:
  • That f_name, l_name, and email are not null
  • That claim_num is a 6 digit number
  • That phone_num is in the format ###-###-####
For each claim that contains errors, we will print out an error indicating what information is missing or invalid. If the data is all valid for a claim, we will state that the data is valid.
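The checks listed above can be sketched roughly as follows. This is not the code from my screenshot: the function name find_errors is hypothetical, the phone check uses a regular expression instead of my own method, and the claim number length assumes the six-digit standard set in the project overview.

```python
import re

CLAIM_NUM_DIGITS = 6  # assumption: the six-digit standard from the overview
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")  # ###-###-####


def find_errors(claim):
    """Return a list of problems found in one claim (a dict of strings)."""
    errors = []
    # Required text fields must not be empty strings.
    for field in ("f_name", "l_name", "email"):
        if claim.get(field, "") == "":
            errors.append(f"{field} is missing")
    # The claim number must be all digits and the right length.
    claim_num = claim.get("claim_num", "")
    if not (claim_num.isdigit() and len(claim_num) == CLAIM_NUM_DIGITS):
        errors.append("claim_num is not a valid claim number")
    # The phone number must match the pattern exactly.
    if not PHONE_PATTERN.fullmatch(claim.get("phone_num", "")):
        errors.append("phone_num is not in the format ###-###-####")
    return errors


good = {"claim_num": "131313", "f_name": "Jane", "l_name": "Doe",
        "email": "jane@example.com", "phone_num": "216-555-0100"}
bad = {"claim_num": "1234", "f_name": "", "l_name": "Doe",
       "email": "", "phone_num": "call me"}

for claim in (good, bad):
    problems = find_errors(claim)
    print(claim["claim_num"], "valid" if not problems else problems)
```

Returning a list of messages, rather than printing inside each check, makes it easy to report every deficiency for a claim at once.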

Here is the data we are reading from:



Here's the code:


And here is the output:


Notes on the code:
Before going on to read the rest of this post, I encourage you to read through the code provided above. Once you have done that, here are some observations:

  • Notice that this code is written using methods. We have a main() method where the program starts, and one additional method called check_phone_validity(). I wrote this method because checking for a valid phone number requires more code than a simple conditional statement. I wrote all the other checks in the main() method because they were simple and short enough to handle there. I could have written methods for each check case, but it was not necessary and is really a matter of preference. Note that this code could also have been written without using any methods at all. This is a great example of Python's flexibility and ease of use!
  • Since f_name, l_name, and email are strings, to check if they are null, we compare them to "". If we were working with integers, we would compare them to None. This is a distinction I came across and learned about as I was writing the code.
  • We use the boolean variable 'clean' to mark whether the row of data being checked has any deficiencies. If it does, clean is set to False; otherwise, the claim is reported to have valid data. In my next blog post, I will add a line at the very end of the method, under the last conditional, so that if clean is True, the program calls a method to send a formatted email to the claimant with the information for that claim.
  • The check_phone_validity() method first checks that the input has 12 characters. Then it goes on to make sure that in positions 3 and 7 in the string, there must be a dash, in keeping with the format requirements. We return whether an invalid phone number affects the validity of the data.
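Since the screenshot of the code may not be visible, here is a rough sketch of a phone check along the lines of the last bullet. The return convention (True for a valid number) and the final digits-only step are my assumptions and may not match the original method exactly.

```python
def check_phone_validity(phone_num):
    """Return True if phone_num looks like ###-###-####."""
    # The format has exactly 12 characters: 10 digits and 2 dashes.
    if len(phone_num) != 12:
        return False
    # Positions 3 and 7 (counting from 0) must hold the dashes.
    if phone_num[3] != "-" or phone_num[7] != "-":
        return False
    # Extra step beyond what the post describes: the rest must be digits.
    digits = phone_num[:3] + phone_num[4:7] + phone_num[8:]
    return digits.isdigit()


print(check_phone_validity("216-555-0100"))
print(check_phone_validity("2165550100"))
```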
As always, feel free to comment or ask questions in the comments section below!


Sunday, April 8, 2018

File Import

As mentioned in the previous post, this post will be about how to import a csv file into the Python environment.

I started by creating an Excel file called ClaimsToday.xls containing fictitious information for 6 different claims. In order to transition smoothly into the checks for validity that will be addressed in the next post, I configured the data below such that out of the six claims, only one claim has all valid data fields:



Note the following:

  • Claim 123456 has the email field missing
  • Claim 121212 has the l_name field missing
  • Claim 123123 has f_name and email missing
  • Claim 131313 is the one valid claim with all information correct
  • Claim 1234 has an invalid 4 digit claim number
  • Claim 111007 has a string in the phone field, where there needs to be a number
Given this data, moving forward, the program will need to flag all claims except claim 131313, whose information the program will email. The program will print a message describing the deficiencies in the other claims and will not attempt to email their information to the claimant.

Once I created the Excel file, I saved it as ClaimsToday.csv.

Below is the code to import the csv file. We import and access the data using a reader. (If you were in Advance Webdesign last semester, we learned about this in that class, but using C#!) This code recognizes the first row of our data as fieldnames by default. After reading the file, I have a print statement to verify that my code worked the way I wanted it to.
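A minimal sketch of reading claims with csv.DictReader is shown here. To keep it self-contained it reads from an in-memory string instead of ClaimsToday.csv; the helper name read_claims and the sample rows are invented for illustration, while the column names follow the ones used in the data verification post.

```python
import csv
import io

# Stand-in for the contents of ClaimsToday.csv (sample values are made up).
SAMPLE_CSV = """claim_num,f_name,l_name,email,phone_num
131313,Jane,Doe,jane@example.com,216-555-0100
1234,John,Smith,john@example.com,216-555-0101
"""


def read_claims(file_obj):
    """Read every row into a dict keyed by the header row's field names."""
    reader = csv.DictReader(file_obj)  # first row becomes the fieldnames
    return list(reader)


claims = read_claims(io.StringIO(SAMPLE_CSV))
for claim in claims:
    print(claim["claim_num"], claim["f_name"], claim["l_name"])
```

With a real file, the same function could be called as read_claims(f) inside a block such as: with open("ClaimsToday.csv", newline="") as f.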



Here's the result of the print statement:


Now that we have the data we need, in my next post I will be moving forward into showing you how we can check the data before sending out the email.


Sources:
https://docs.python.org/3/library/csv.html#csv.DictReader

Tuesday, April 3, 2018

Project Overview

Upon my graduation this May, I will be working at Progressive Insurance as a data analyst. From what I know, I will be working with insurance claims data in some capacity.

Throughout the next three posts, I will be building a small Python program that simulates a large-scale program a major insurance company could run daily to email claimants the information for the claim they filed that day. The program will do the following:

  1. The program will read a csv file ClaimsToday with claims information. Every row will hold information for a different claim. In reality, for a big insurance company, there would be thousands of rows, but my file will contain a small sample of claims. The various columns will hold the different pieces of information for each claim, including claim number, claimant first name, claimant last name, address, phone number, email, description, damage, location of accident, date of accident, time of accident, police involvement etc.
  2. Then the program will check the data for completeness and validity. Once the data is read into the program, we want to make sure that it is clean. Certain required fields must be filled, such as claim number, first and last name, and email. Other fields will have to be checked for formatting. For instance, if phone number is not a number or if claim number is not in the right format (we'll set the standard for a claim number to be six digits for the purpose of this exercise), the program should produce an error message to report the deficiencies in the data. Note that large corporations have entire departments dedicated to data cleansing, so we will be doing a very small example of this sort of work here.
  3. Finally, if the data is clean, the program will format a message with the information for each claim, which will be automatically sent to the email associated with the claim. I will be sending all emails to my personal email account for the purposes of testing the program.
My next blog post will cover the first step of this process, reading the data, so stay tuned!

Fun fact, I was Flo from Progressive this past Halloween! Until next time!





Tuesday, March 20, 2018

Dynamic Typing (and a new datatype!)

As mentioned before, Python is a dynamically typed language. In this post we will further explore the functionality of a dynamically typed language.

In Python, there is a cool module called "types" that defines the names of the different types of variables. We can explore this module to see how Python recognizes a variable type without it being declared by the programmer.

Identifying a datatype in Python takes a very small amount of code; this hearkens back to my previous blog post that talked about the conciseness of Python's language in one-liners of code. Because Python is dynamically typed, every value carries its own type at run time, so the built-in type() function is able to recognize any valid datatype for what it is.
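As a small sketch of this idea, the built-in type() function can be asked about a handful of values that were never declared with a type. The sample values are my own and not necessarily the ones in the original screenshot.

```python
# None of these values was declared with a type; Python tracks it at run time.
samples = [42, 3.14, "hello", True, None]

type_names = [type(value).__name__ for value in samples]

for value, name in zip(samples, type_names):
    print(repr(value), "is of type", name)
```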



And here is the output:



Dictionaries:

Now I will briefly introduce a datatype you may not have seen before. "A Python dictionary is a mapping of unique keys to values." Dictionaries sort of act like two lists joined together or a 2D array, where there is an association between the two parts. I am introducing dictionaries not only because they are an interesting and useful datatype to know about and one I have not encountered before in other languages, but also because I will elaborate on the above example, where we recognize the datatypes of each variable, to show that even more complex datatypes are recognized.

Here is the example again, with an example for a dictionary as well as a list included:


























Other than the possibly unfamiliar syntax of a dictionary, you might notice something else unfamiliar here. In this example, I declared name_number as a dictionary and primes as a list. Since Python is dynamically typed, I did not need to do this, but I wanted to demonstrate that it is still possible and could be useful to do, especially as your programs get more complicated.

I could have just as easily done the following:
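To make the comparison concrete, here is a sketch of both forms side by side. I am assuming the "declared" version used Python 3 type annotations; the contents of name_number and primes are invented, and the names ending in _plain are only there so both versions can live in one snippet.

```python
# With optional type annotations (allowed, but not required):
name_number: dict = {"Alice": 1, "Bob": 2}
primes: list = [2, 3, 5, 7]

# Without annotations; Python still works out the types on its own:
name_number_plain = {"Alice": 1, "Bob": 2}
primes_plain = [2, 3, 5, 7]

print(type(name_number), type(name_number_plain))
print(type(primes), type(primes_plain))
print(name_number["Bob"])  # look up the value stored under the key "Bob"
```

The annotations do not change how the program runs; they mainly help readers and type-checking tools.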







Sources:
http://www.secnetix.de/olli/Python/dynamic_typing.hawk
https://medium.com/@ageitgey/learn-how-to-use-static-type-checking-in-python-3-6-in-10-minutes-12c86d72677b
http://www.pythonforbeginners.com/dictionary/how-to-use-dictionaries-in-python
http://developer.rhino3d.com/guides/rhinopython/python-datatypes/#tuple
Explore this last link to read about tuples, another datatype in Python. Tuples look and act very much like lists, but there are some important differences. I included this reference as supplemental reading as I feel it is important information, but does not necessarily relate to this post.

Sunday, March 18, 2018

Python One Liners

Many code structures that take multiple lines in a language like Java can be written using only one line of Python code. One liners of code save tons of space. They also show examples of Python's concise and structured language as I described in the first blog post.

We have already discussed how brief language is a special feature of Python, such as one line conditional statements. In this post, I will list some more examples of how Python uses one line of code to do things that take multiple lines in a language such as Java.

1. Swapping numbers

It's the tricky question that's bound to be on an exam in your coding 101 class. In Java, swapping numbers requires something like the following code, where a temp variable is created to store one of the variables before swapping:



In Python, however, we can simply use one line of code, rather than three to accomplish the swap. This is called 'unpacking for swapping variables':
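A quick sketch of the one-line swap, with variable names and values of my own choosing:

```python
a, b = 1, 2      # starting values
a, b = b, a      # the right side is packed into a tuple, then unpacked
print(a, b)      # prints: 2 1
```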







2. Duplicating strings

In a language like Java, to duplicate a string, one would have to write a loop that concatenates the string to itself:





But in Python, we can "multiply" a string like we would a number, all in just one line of code.
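For example (the string and the count here are just an illustration):

```python
laugh = "ha" * 3   # repeat the string three times
print(laugh)       # prints: hahaha
```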





3. Store list values in separate variables

In Python, we can assign values in a list to variables using only one line of code.
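A short sketch with made-up values; note that the number of variables on the left has to match the number of items in the list:

```python
scores = [10, 20, 30]
first, second, third = scores   # one variable per list item
print(first, second, third)     # prints: 10 20 30
```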





Doing this in Java would require a loop to go through each position in the array. I will not show this code in Java, but you can imagine that it would take many more lines of code than we used in Python.

4. Combine items in 2 separate lists

Items in two lists can be paired up position by position using Python's built-in zip function. The following is an example. Note that in this example, we use a two-line for loop to 'zip' the lists together.
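Here is a sketch of the idea with two invented lists. The extra paired list is only there to show what zip produces; the two-line loop is the part that matters.

```python
first_names = ["Ann", "Bob", "Cara"]
last_names = ["Lee", "Ray", "Diaz"]

# zip pairs up the items found at the same position in each list.
for first, last in zip(first_names, last_names):
    print(first, last)

paired = list(zip(first_names, last_names))
```

If the lists have different lengths, zip stops at the end of the shorter one.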








Doing this in another language would require looping through each list and concatenating the strings position by position, which would certainly take more than two lines of code.

5. Negative Indexing

Another handy feature of Python is negative indexing. Python recognizes negative indexes of a list and interprets them by counting through the list backwards, so index -1 is the last item. This is a cool feature that can be coded in one line, and it is not a feature in languages like Java.
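A small sketch with a list of my own:

```python
letters = ["a", "b", "c", "d"]

last = letters[-1]            # counts from the end: the last item
second_to_last = letters[-2]  # one step further back

print(last, second_to_last)   # prints: d c
```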





And now for a (Monty) Python one-liner:
"My brain hurts!"
- Monty Python's Flying Circus

Until next week!

Sources:
https://www.codementor.io/sumit12dec/python-tricks-for-beginners-du107t193
http://sahandsaba.com/thirty-python-language-features-and-tricks-you-may-not-know.html