Analyzing IBM Watson experiments with IPython Notebook
Authors: Bittner, Torsten, IBM
IBM's Emerging Technologies team was tasked with migrating the IBM Watson system that won the Jeopardy!-like game to a domain-independent codebase. This task started as a software engineering exercise and later became an information engineering exercise as we worked to optimize the system's question-answering ability for new domains. In this new paradigm the team would observe and measure a system behavior, such as its accuracy in generating candidate answers to a particular type of question, and then hypothesize what (software) change to the system would improve the behavior and how it would impact the original measurement. The team would then implement the change, re-run the system against a test dataset, analyze the gigabyte-sized test results to evaluate the difference in system behavior. By conducting many series of these experimental iterations, the team was able to significantly improve IBM Watson's question-answering performance.
This talk describes how we used the IPython notebook environment and the rich set of Python data science libraries (e.g. Pandas, NumPy/SciPy) to perform reproducible science, which resulted in improvements to IBM Watson's accuracy.