Building Data Apps with Python on August 23rd


Data Community DC and District Data Labs are excited to be hosting another Building Data Apps with Python workshop on August 23rd.  For more info and to sign up, go to http://bit.ly/V4used.  There’s even an early bird discount if you register before the end of this month!

Overview

Data products are usually software applications that derive their value from data by leveraging the data science pipeline, and that generate more data through their operation. They aren’t apps with data, nor are they one-time analyses that produce insights – they are operational and interactive. The rise of these types of applications has directly contributed to the rise of the data scientist and the idea that data scientists are professionals “who are better at statistics than any software engineer and better at software engineering than any statistician.”

These applications have largely been built with Python. Python is flexible enough for rapid development on many different types of servers and has a rich tradition in web applications. It contributes to every stage of the data science pipeline, including real-time ingestion and the production of APIs, and it is powerful enough to perform machine learning computations. In this class we’ll produce a data product with Python, leveraging every stage of the data science pipeline to produce a book recommender.

What You Will Learn

Python is one of the most popular programming languages for data analysis. Therefore, it is important to have a basic working knowledge of the language in order to access more complex topics in data science and natural language processing. The purpose of this one-day course is to introduce the development process in Python using a project-based, hands-on approach. In particular, you will learn how to structure a data product using every stage of the data science pipeline: ingesting data from the web, wrangling data into a structured database, computing a non-negative matrix factorization with Python, and then producing a web-based report.
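To make the matrix factorization step concrete, here is a minimal sketch of non-negative matrix factorization in NumPy. It assumes the classic multiplicative update rules and a tiny made-up ratings matrix; the function name `nmf`, its parameters, and the data are illustrative assumptions, not the workshop's actual code.

```python
import numpy as np


def nmf(V, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0.

    Uses multiplicative updates that minimize the squared (Frobenius)
    reconstruction error. This is an assumed technique for illustration.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Random positive starting points; eps keeps entries away from zero.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(iterations):
        # Each update rescales the factors element-wise, so they stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical ratings: rows are readers, columns are books, 0 means unrated.
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [0.0, 1.0, 5.0, 4.0],
    [1.0, 0.0, 4.0, 5.0],
])

W, H = nmf(ratings, rank=2)
predicted = W @ H  # dense score matrix, including previously unrated cells
print(np.round(predicted, 1))
```

In a recommender, the rows of `W` can be read as reader preferences over latent "taste" factors and the columns of `H` as how strongly each book expresses those factors; the product fills in scores for books a reader has not rated.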

Course Outline

The workshop will cover the following topics:

  • Basic project structure of a Python application

  • virtualenv & virtualenvwrapper

  • Managing requirements outside the stdlib

  • Creating a testing framework with nose

  • Ingesting data with requests

  • Wrangling data into SQLite Databases using SQLAlchemy

  • Building a recommender system with Python

  • Computing a matrix factorization with NumPy

  • Storing computational models using pickles

  • Reporting data with JSON

  • Data visualization with Jinja2

After this course you should understand how to build a data product using Python and will have built a recommender system that implements the entire data science pipeline.
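As a rough illustration of two of the outline topics, storing a computed model with pickle and reporting results as JSON, the sketch below uses only the standard library. The model contents, user names, book titles, and helper function names are hypothetical placeholders rather than material from the course.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical trained model: a predicted score per user for each book title.
model = {
    "titles": ["Dune", "Emma", "Ulysses"],
    "scores": {"alice": [4.6, 0.7, 2.1], "bob": [0.5, 4.4, 3.0]},
}


def save_model(model, path):
    """Serialize the model to disk so it need not be recomputed."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def recommendations_report(model, user, top_n=2):
    """Return a JSON string with the user's top_n highest-scoring titles."""
    ranked = sorted(
        zip(model["titles"], model["scores"][user]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    payload = {
        "user": user,
        "recommendations": [
            {"title": title, "score": score} for title, score in ranked[:top_n]
        ],
    }
    return json.dumps(payload, indent=2)


# Round-trip the model through a temporary file, then build the report.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pickle"
    save_model(model, model_path)
    restored = load_model(model_path)

report = recommendations_report(restored, "alice")
print(report)
```

Separating the expensive computation (saved once as a pickle) from the cheap reporting step (JSON built on request) is one common way to keep a data product responsive; a web front end could then render the JSON in a template.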

Instructor: Benjamin Bengfort

Benjamin is an experienced Data Scientist and Python developer who has worked in the military, industry, and academia for the past eight years. He is currently pursuing his PhD in Computer Science at the University of Maryland, College Park, doing research in Metacognition and Active Logic. He is also a Data Scientist at Cobrain Company in Bethesda, MD, where he builds data products including recommender systems and classifier models. He holds a Master's degree from North Dakota State University, where he taught undergraduate Computer Science courses. He is also adjunct faculty at Georgetown University, where he teaches Data Science and Analytics.


Tony Ojeda

Manager, Data Analysis and Strategic Solutions at Follett Higher Education Group
Tony Ojeda is an accomplished data scientist and entrepreneur with expertise in business process optimization and over a decade of experience creating and implementing innovative data products and solutions. He has a Master's in Finance from Florida International University and an MBA with concentrations in Strategy and Entrepreneurship from DePaul University. He is the founder of District Data Labs, a co-founder of Data Community DC, and is actively involved in promoting data science education through both organizations.