Monthly Archives: April 2013

Toward A Better Definition of “Big Data”

Many have tried to define it, but the term “big data” still lacks a true consensus definition. At the moment, the most popular definitions seem to coalesce around the idea that big data is one or more data sets so large and complex that … Continue reading

Posted in Commentary | 5 Comments

Weekly Round-Up: Ford’s Data, Apple’s iWatch, Wavii’s Acquisition, and Fighting Malaria

Welcome back to the round-up, an overview of the most interesting data science, statistics, and analytics articles of the past week. This week, we have 4 fascinating articles on topics ranging from how Ford is leveraging data to improve their … Continue reading

Posted in Round-Ups | Leave a comment

Data Visualization: From Excel to ???

So you’re an Excel wizard: you make the best graphs and charts Microsoft’s classic product has to offer, and you expertly integrate them into your business operations. Lately you’ve studied up on all the latest uses for data visualization and … Continue reading

Posted in Data Sources, Data Visualization DC, DataBlog, Javascript, Methods, Python, R, Resources, Reviews, Shiny, Visualization | 1 Comment

Big Data Infrastructure – DataBusinessDC Meetup FollowUp

Thank you! Thank you to everyone who attended last night’s Big Data Infrastructure meetup! Special thanks go out to all of the organizers, including Robert Vesco, Ben Bengfort, and Josh Hurd, and to everyone at 1776 who let us use their … Continue reading

Posted in Data Innovation DC | Leave a comment

Amazon EC2 versus Google Compute Engine, Part 4

It has been a while since we talked about cloud computing benchmarks, and we wanted to bring a recent and relevant post to your attention. But before we do, let’s summarize the story of EC2 versus Google Compute Engine. Our … Continue reading

Posted in Projects, Reviews | Leave a comment

Data Science MD Joins the Data Community

In four short months, Data Science MD has grown from a mere idea to a 450+ member group hosting meetups around Maryland. When we were originally forming the group, one of our goals was to help create a thriving local community … Continue reading

Posted in Announcements, Community, Data Science MD | Leave a comment

Weekly Round-Up: Probabilistic Programming, Tech Startups, Data Viz Elements, and Super Mario Bros.

Welcome back to the round-up, an overview of the most interesting data science, statistics, and analytics articles of the past week. This week, we have 4 fascinating articles on topics ranging from probabilistic programming to machines playing video games. In … Continue reading

Posted in Round-Ups | Leave a comment

Resources and Readings for Big Data Week DC Events

We asked presenters at the various Big Data Week events here in the DC area to send us any books, articles, or blog posts that they recommend that are related to their presentations. We hope you find this list of … Continue reading

Posted in Events, Resources | Leave a comment

Data Visualization: The Data Industry

In any industry you either provide a service or a product, and data science is no exception. Although the people who constitute the data science workforce are in many cases rebranded from statistician, physicist, algorithm developer, computer scientist, biologist, or … Continue reading

Posted in Community, Methods, Reviews, Tutorials | 1 Comment

A Survey of Stochastic and Gazetteer Based Approaches for Named Entity Recognition – Part 2

This is part 2 of a two-part series. Part 1 is here. Approaches to Named Entity Recognition: Generally speaking, the most effective named entity recognition systems can be categorized as rule-based, gazetteer-based, and machine learning approaches. Within each of these … Continue reading

Posted in White Paper | 1 Comment