Meghan Blanchette

Natural language annotation for machine learning

James Pustejovsky and Amber Stubbs on machine learning best practices.

James Pustejovsky (@jamespusto) is an O’Reilly author and professor of computer science at Brandeis. Amber Stubbs (@amber_stubbs) is an O’Reilly author and postdoc at SUNY Albany.

We sat down to talk about natural language annotation as it relates to machine learning. James and Amber reviewed methods, best practices, and what they see coming in the future.

Highlights from the conversation include:

  • Learn why it is important to create your own corpus for machine learning. [Discussed 20 seconds in.]
  • Discover different methods for creating a corpus. [Discussed at the 6:15 mark.]
  • Understand the MATTER Annotation Development Process. [Discussed at the 9:58 mark.]
  • Hear what James and Amber see coming next for machine learning. [Discussed at the 15:23 mark.]

You can view the entire interview in the following video.

The future of MongoDB

Steve Francia on alternatives to Hadoop and what lies ahead for MongoDB.

Steve Francia (@spf13) is an O’Reilly author and chief evangelist at 10gen.

Steve and I sat down during the Strata + Hadoop World conference in New York last month to talk about what he’s most excited about these days. He focused on alternatives to Hadoop, what we can expect to see next from MongoDB, and the future of big data.

Highlights from the conversation include:

  • Discover alternatives to Hadoop. [Discussed 18 seconds in.]
  • The new features in MongoDB 2.2. [Discussed at the 1:23 mark.]
  • How being an open source company helps 10gen connect with its users. [Discussed at the 3:09 mark.]
  • Long-term goals for MongoDB. [Discussed at the 5:10 mark.]
  • New technologies are enabling all of us to participate in big data. [Discussed at the 7:05 mark.]

You can view the entire interview in the following video.
