Author Archives: Sean Murphy

About Sean Murphy

Sean Patrick Murphy, with degrees in math, electrical engineering, and biomedical engineering and an MBA from Oxford, has served as a senior scientist at Johns Hopkins University for over a decade, advises several startups, and provides learning analytics consulting for EverFi. Previously, he served as the Chief Data Scientist at a Series A-funded health care analytics firm and as the Director of Research at a boutique graduate educational company. He has also cofounded a big data startup and Data Community DC, a 2,000-member organization of data professionals. Find him on LinkedIn and Twitter.

Former Obama for America and LivingSocial Data Scientists Show Off Their Startups – Data Innovation DC – Next Week!

Welcome Back! As a few people have mentioned, DIDC has been missing in action for January and February and, for that, we must apologize. We had an amazing sequence of events planned for the last two months that fell through … Continue reading

Posted in Announcements, Community, Data Innovation DC

The Fall of the P-Value

We at Data Community DC wanted to highlight a very interesting and relevant article for data practitioners. For most people, P-values are the “gold standard” by which the validity of scientific results is measured. However, mounting … Continue reading

Posted in Community

Flask Mega Meta Tutorial for Data Scientists

Introduction: Data science isn’t all statistical modeling, machine learning, and data frames. Eventually, your hard work pays off and you need to give back the data and the results of your analysis: those blinding insights that you and your team … Continue reading

Posted in Languages, Methods, Python, Resources | 1 Comment

Expanding the Online Presence of Data Community DC – W3DC’s Strategic Plan for 2014

by Sean Murphy & Benjamin Bengfort. W3DC handles the online and technological aspects of Data Community DC. Its primary scope is the web domain as well as content and applications. Because of this, its natural responsibilities fall into several … Continue reading

Posted in Community, DataBlog

A Tutorial for Deploying a Django Application that Uses NumPy and SciPy to Google Compute Engine Using Apache2 and mod_wsgi

by Sean Patrick Murphy. Introduction: This longer-than-initially-planned article walks one through the process of deploying a non-standard Django application on a virtual instance provisioned not from Amazon Web Services but from Google Compute Engine. This means we will be … Continue reading

Posted in Commentary, Methods, Python | 1 Comment

Is Statistics the Least Important Part of Data Science?

There is a fascinating discussion occurring on Andrew Gelman’s blog that some of our Data Community DC members might want to chime in on … or discuss right here on our blog. There’s so much that goes on with data … Continue reading

Posted in DataBlog

PyDataNYC 2013 – A Summary of a Fantastic Conference for Data Community

PyData NYC 2013 was a two-day conference this past weekend (Saturday and Sunday, 11/9 and 11/10) with a day of tutorials on Friday. Saturday and Sunday featured keynotes each morning and three tracks of talks and workshops. JP Morgan graciously … Continue reading

Posted in Commentary, Reviews | 3 Comments

“Ten Simple Rules for Reproducible Computational Research” – An Excellent Read for Data Scientists

Recently, the journal PLOS Computational Biology published an excellent article by Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig entitled “Ten Simple Rules for Reproducible Computational Research.” The list of ten rules below resonates strongly with my … Continue reading

Posted in Announcements, Community, Micro | 2 Comments

Public Service Announcement: Think Twice About Upgrading to OS X Mavericks If You Use RStudio

Upgrading to any new operating system can be problematic. However, if you depend on R and RStudio for your living, I would highly recommend NOT upgrading to OS X Mavericks, despite its nonexistent price tag. Personally, I have seen … Continue reading

Posted in Announcements, Commentary, Community, Statistical Programming DC | 1 Comment

Women in Data – A Special Event on Monday, September 23rd

Data Innovation DC is excited to bring you a very special event: Women in Data. As many members may have noticed, tech-oriented events, including many offered by Data Community DC, tend to have male-dominated audiences. While much has been said … Continue reading

Posted in Announcements, Community, Data Innovation DC