ENTRIES TAGGED "bayes"
One of the chapters of Think Bayes is based on a class project two of my students worked on last semester. It presents “The Red Line Problem,” which is the problem of predicting the time until the next train arrives, based on the number of passengers on the platform.
Here’s the introduction:
In Boston, the Red Line is a subway that runs between Cambridge and Boston. When I was working in Cambridge I took the Red Line from Kendall Square to South Station and caught the commuter rail to Needham. During rush hour Red Line trains run every 7–8 minutes, on average.
When I arrived at the station, I could estimate the time until the next train based on the number of passengers on the platform. If there were only a few people, I inferred that I just missed a train and expected to wait about 7 minutes. If there were more passengers, I expected the train to arrive sooner. But if there were a large number of passengers, I suspected that trains were not running on schedule, so I would go back to the street level and get a taxi.
While I was waiting for trains, I thought about how Bayesian estimation could help predict my wait time and decide when I should give up and take a taxi. This chapter presents the analysis I came up with.
Sadly, this problem has been overtaken by history: the Red Line now provides real-time estimates for the arrival of the next train. But I think the analysis is interesting, and still applies for subway systems that don’t provide estimates.
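To make the reasoning in the introduction concrete, here is a minimal sketch of how passenger counts could update a wait-time estimate. It is not the analysis from the chapter. It assumes, purely for illustration, that trains arrive at a fixed gap of 7.5 minutes, that passengers arrive as a Poisson process at 2 per minute, and that the time since the last train is uniform a priori. The function names and parameter values are hypothetical.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k arrivals when mu arrivals are expected."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def posterior_mean_wait(num_passengers, gap=7.5, arrival_rate=2.0, steps=1000):
    """Posterior mean wait in minutes, given the passengers on the platform.

    Assumptions (illustrative, not from the book): trains are exactly
    `gap` minutes apart, and passengers arrive at `arrival_rate` per minute.
    """
    dx = gap / steps
    # Grid of hypotheses for the elapsed time since the last train.
    elapsed = [(i + 0.5) * dx for i in range(steps)]
    # Uniform prior, so the posterior weight is just the Poisson likelihood
    # of seeing num_passengers arrivals in that much elapsed time.
    weights = [poisson_pmf(num_passengers, arrival_rate * x) for x in elapsed]
    total = sum(weights)
    # The remaining wait under each hypothesis is gap - elapsed.
    return sum(w * (gap - x) for w, x in zip(weights, elapsed)) / total


if __name__ == "__main__":
    for k in (0, 5, 12):
        print(k, round(posterior_mean_wait(k), 2))
```

Under these assumed numbers, an empty platform gives an expected wait of about 7 minutes, and the estimate shrinks as the platform fills. Because the gap is fixed, this sketch cannot represent the "trains are not running on schedule" case; that would require a distribution over gaps rather than a single value.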
An interview with Allen Downey, the author of Think Bayes
When Mike first discussed Allen Downey’s Think Bayes book project with me, I remember nodding a lot. As the data editor, I spend a lot of time thinking about the different people within our Strata audience and how we can provide what I refer to as “bridge resources.” We need to know and understand the environments that our users are most comfortable in and provide them with the appropriate bridges to learn a new technique, language, tool, or… even math. I’ve also been very clear that almost everyone will need to improve their math skills should they decide to pursue a career in data science. So when Mike mentioned that Allen’s approach was to teach math not using math… but using Python, I immediately indicated my support for the project. Once the book was written, I contacted Allen about an interview, and he graciously took some time away from the start of the semester to answer a few questions about his approach, teaching, and writing.
How did the “Think” series come about? What led you to start the series?
Allen Downey: A lot of it comes from my experience teaching at Olin College. All of our students take a basic programming class in the first semester, and I discovered that I could use their programming skills as a pedagogic wedge. What I mean is if you know how to program, you can use that skill to learn everything else.
I started with Think Stats because statistics is an area that has really suffered from the mathematical approach. At a lot of colleges, students take a mathematical statistics class that really doesn’t prepare them to work with real data. By taking a computational approach I was able to explain things more clearly (at least I think so). And more importantly, the computational approach lets students dive in and work with real data right away.
At this point there are four books in the series and I’m working on the fifth. Think Python covers Python programming; it’s the prerequisite for all the other books. But once you’ve got basic Python skills, you can read the others in any order.