It was a little over three years ago that I got the opportunity to be one of the founding members of the Open Source Office (OSO) at Capital One. The Open Source Office drives Capital One’s open source strategy, defining the policies and procedures for the safe adoption of, and contribution to, open source projects. Since I joined the group, there has been no lack of exciting challenges in our efforts to help Capital One embrace open source! Let me tell you one story.
After joining the Open Source Office, I started looking for open source solutions to my work in a broader, more purposeful way. While I had some previous exposure to Python, my appreciation for the language and the community really grew with my journey into open source.
The Early Days
In my early days as a developer, I punched Fortran cards and wrote pages and pages of code in C before moving on to client-server development using PowerBuilder and Visual Basic. I especially enjoyed working on the database side and had the opportunity to work with several RDBMSs that were prevalent in the 1990s and early 2000s.
In 2013, I had the opportunity to use the Python SDK for Splunk to create a monitoring dashboard. In all, it took me one three-day weekend to learn the basic Python language constructs and the SDK methods needed to produce an elaborate web report. The ease of using Python (with some help from Unix system calls) felt like moving from the hot and humid equator of DLLs and registry settings to the cool breeze of the Caribbean. To this day, I continue to leverage Python to automate tasks that my team had been doing manually.
Which brings us to 2015, when my first task in the Open Source Office was to develop an executive dashboard providing a daily report on the success of Hygieia. Hygieia was Capital One’s first major open source project, and we were excited to track its progress in the open source and DevOps communities. (It’s a DevOps dashboard tool; you should check it out on our site.)
For this, I had to get metrics related to project usage from GitHub. Putting together this report was the perfect opportunity for me to expand my Python skills, as well as to leverage AWS. Having recently finished a project that made me wait for days to get six VMware machines running, the pleasure of spinning up an EC2 instance and a PostgreSQL database in a matter of seconds was immeasurable.
I designed a simple solution in Python to collect GitHub repo-level details on issues, PRs, forks, etc. The program used the requests package to call the GitHub APIs, and the data was then pushed into a PostgreSQL database in AWS. Using an automated program to get the data made it easy to produce the executive dashboard on a daily basis, or sometimes even multiple times a day.
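A minimal sketch of what such a collector might look like, assuming the GitHub REST v3 repos endpoint and a hypothetical `repo_metrics` table (the field names, table schema, and token handling here are my own illustrative choices, not the original code):

```python
import requests

GITHUB_API = "https://api.github.com"

def extract_metrics(repo_json):
    """Pull just the dashboard fields out of a GitHub /repos payload."""
    return {
        "name": repo_json["full_name"],
        "forks": repo_json["forks_count"],
        "open_issues": repo_json["open_issues_count"],
        "stars": repo_json["stargazers_count"],
    }

def fetch_repo(owner, name, token):
    """GET one repo's details and reduce them to the metrics we track."""
    resp = requests.get(
        f"{GITHUB_API}/repos/{owner}/{name}",
        headers={"Authorization": f"token {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return extract_metrics(resp.json())

def store(conn, m):
    """Insert one snapshot row; conn is any DB-API 2.0 connection
    (pg8000 was the driver used here, with its %s parameter style)."""
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO repo_metrics (name, forks, open_issues, stars) "
        "VALUES (%s, %s, %s, %s)",
        (m["name"], m["forks"], m["open_issues"], m["stars"]),
    )
    conn.commit()
```

Run on a schedule (cron, in this era before Lambda was common), a script like this keeps the dashboard fresh without any manual pulls.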
Here are some of the decisions that went into it:
- Capital One has embraced the public cloud, so I decided to use a Red Hat Linux EC2 instance; since I was working on a Mac, the Linux environment felt familiar, and installing the required Python packages was quick and easy.
- My search for a database driver to talk to PostgreSQL was relatively easy thanks to Python’s package index, PyPI. While psycopg2 is a popular choice, I went with pg8000 because of the license that psycopg2 uses. Until I joined the Open Source Office, I had been using open source software with little to no awareness of the different types of open source licenses, but being in the Open Source Office helped me understand the difference between permissive and copyleft licenses and which one I should use on this project.
- PostgreSQL was new to me, but other than the slightly confusing setup of user roles and groups, it didn’t take much time to start working with the database. And I cannot thank the maintainers of Python’s json module enough: walking through the result set of the GitHub API and extracting only the information I needed was a breeze.
- My biggest challenge in choosing the different components was establishing connectivity for the requests package to call the GitHub API. I struggled to understand how to get through our corporate proxy and handle SSL verification. Once again, the documentation for the requests package, along with community-posted questions and solutions, helped me select the right methods and code the calls.
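For the proxy and SSL hurdles above, requests exposes `proxies` and `verify` keyword arguments; a hedged sketch of how they fit together (the proxy URL and CA-bundle path below are hypothetical placeholders for whatever a corporate environment supplies):

```python
import requests

def request_options(proxy_url=None, ca_bundle=None):
    """Build keyword arguments for requests.get: route traffic through a
    proxy when one is given, and verify TLS against a custom CA bundle
    (falling back to the default trust store when none is provided)."""
    opts = {
        "timeout": 30,
        # verify accepts True (default), False, or a path to a CA bundle;
        # disabling verification entirely should be a last resort.
        "verify": ca_bundle if ca_bundle else True,
    }
    if proxy_url:
        opts["proxies"] = {"http": proxy_url, "https": proxy_url}
    return opts

def call_github(url, token, proxy_url=None, ca_bundle=None):
    """GET a GitHub API URL through the configured proxy/TLS settings."""
    resp = requests.get(
        url,
        headers={"Authorization": f"token {token}"},
        **request_options(proxy_url, ca_bundle),
    )
    resp.raise_for_status()
    return resp.json()
```

Keeping the proxy and TLS options in one helper makes it easy to flip between the corporate network and a direct connection without touching the API-calling code.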
Creating this report was an important milestone in my journey toward embracing Python. I realized that I could have used PyGithub, but my own code provided the flexibility and the speed I needed to get my data. I also re-used these packages down the road when producing dashboards to track the success of some internal projects at Capital One.
When I look back at those four weeks in 2015, when I went from a novice Python programmer with a few lines of code running on my laptop to having a complete little program running in AWS, it feels like a serious step in my journey as a developer using Python.
What I love about Python is its simplicity and the short learning curve before you can start delivering solutions. Aided by copious help materials and a deep catalog of open source packages, Python is a boon for novice and skilled programmers alike who are seeking to automate tasks. The Zen-like traditions of the Python community, which encourage simplicity and readability, made it easy for me to understand other open source code and encouraged me to use it more.
Python has use cases where its light weight provides a huge ROI, and its ability to integrate with several other languages, without the usual complexities, makes it attractive to novice and expert programmers alike.
In the last four years, I have been very happy to tell this story of my leap into the world of Python and why the Python community is a great example of doing open source right. Python made me feel that I could reach my objective through reading and research alone, without a costly training program. The community support is so strong that I always feel there are hundreds of folks out there ready to help. I am happy to be writing in Python, and I look forward to writing in it for many more years to come!