Building bridges between conventional business intelligence methods and cutting-edge data science: this is the Budapest BI Forum 2015 summarized in one line. Entrepreneurs, businesspeople, open source contributors and scientists got together in Hungary's capital to spend three days attending hands-on sessions and talks. A perfect occasion for exchanging ideas, insights and experience with data.
Starting with a day packed with workshops on data-science topics, mostly from (but not limited to) the Python and R ecosystems, the Budapest BI Forum set the tone for the two days to come. The conference organizer, Arato Bence, had arranged a great set of speakers: well-known members of the Python data-science community like Wes McKinney of Cloudera and Ian Ozsvald of modelinsight.io, young researchers like Valerio Maggio and Johannes Wachs, representatives of the R community (Francois Romain) and businessmen like Főző Csaba. The recurring theme of this conference is evident: a fruitful exchange between neighboring communities.
Having returned from the conference, I would like to point out four talks that I found especially interesting:
Ian Ozsvald: Shipping Data Science Products. Ian summarized his experience in shipping data-science projects. The gist: for data-science projects to succeed, data scientists must focus on automation and value delivery. He emphasized that there is a difference between data science for the sake of research and development (prototyping) and engineering, which is focused on reliability and long-term success. Ian also explained how the Python ecosystem helps him ship projects every day.
Wes McKinney: Productive Python Analytics At Scale. This was the second talk in the plenary session, given by the creator of Pandas, a popular Python library for analysing tabular and time-series data. Nowadays, Wes is more concerned with his recent project Ibis, a project at Cloudera which aims at tightly integrating industrial-strength big-data databases with the scientific Python ecosystem. A recurring theme in McKinney's talks and writing is "The great decoupling", his observation of a general trend in big data analysis to separate the concerns of user interface, storage and computational engines on distributed big-data systems. This time he explained his thoughts in depth, which was a pleasure to watch.
Johannes Wachs: Analyzing Networks in Python. He presented a talk out of the ordinary. The analyses he conducts for his PhD at the CEU's Center for Network Science predict corruption cases in international public procurement and yield valuable insights into social media interactions.
Főző Csaba: Transactional Data Mining at Lloyds Banking Group. He gave an outstanding presentation about the power of state-of-the-art prediction methods that outperform traditional techniques.