Lindsay Richman - JupyterCon 2017

Post written by Lindsay Richman:

Thanks to the generous support of Women in Machine Learning and Data Science, I had the pleasure of attending JupyterCon 2017. This insightful, well-run conference was held in New York City, and coordinated by O’Reilly Media. Major tech companies such as Amazon, Microsoft, and IBM were among the sponsors and presenters. I was very excited to attend, because I work at an edtech company, 2U. So while I use Jupyter Notebook for data analysis, I was also interested in learning about the open source community’s larger initiative, Project Jupyter, and its role in education. The conference probably had several thousand attendees, which was pretty surprising, even to the presenters! Here are some key developments and takeaways from the sessions:

Jupyter bridges and supports different programming languages. In his keynote presentation, Wes McKinney highlighted the fact that data scientists often work in a variety of languages, with R, Python, and Julia predominating. Jupyter, which started as IPython notebook, now works with all of these languages, and many others. Data scientists in a number of companies are making good use of this feature; Kyle Kelley of Netflix and Hilary Parker of Stitch Fix both gave great talks on how Jupyter has helped them unite the “Python” and “R” camps by giving them a common tool from which to work.

Jupyter allows people with little programming experience to get working with data quickly. My first foray into programming started by practicing Python in a Jupyter notebook; after a simple installation of Anaconda, I was ready to get started. In a similar fashion, Harvard Professor Demba Ba illustrated how he used Jupyter to introduce data analysis to students who had traditional engineering backgrounds, but limited exposure to computer science. Similarly, Vinitra Swamy and Gunjun Baid from UC-Berkeley gave a great presentation on how they used Jupyter to get over 2,000 undergraduate students, about half of whom had never programmed before, working in a data science environment.

Extensions like nbgrader, nbviewer, and RISE are powerful features that can be used easily within the notebooks. Previously, I had used Jupyter for data processing, and did not even consider that it might have other applications. Several sessions at this conference covered popular extensions and plug-ins that showcase Jupyter’s versatility. For example, after installing RISE, you can choose “Slideshow” from the cell toolbar, and assign a slide type to each cell in the notebook. Once you’ve created your slides, you simply have to press a small icon on the toolbar, and your slideshow is generated! Nbgrader enables instructors to create both auto and manually graded assignments, while nbviewer lets the user display Jupyter notebooks in a variety of formats, including HTML.

Jupyter supports dynamic content, which allows users to create interactive dashboards, graphs, and charts. One of the conference’s sponsors, Domino Data Lab, showed how they built data elements in a notebook that the user could interact with; for example, the user could pick any point on a graph, and get the corresponding value. Additionally, Daina Bouquin and John DeBlase presented IPySigma, a network visualization frontend for Jupyter Notebook that uses SigmaJS.

Your favorite O’Reilly book may have been created using Jupyter! Yes, that’s correct – Jupyter Notebook is now used in academic and commercial publishing. Many O’Reilly books are authored using Jupyter, including Introduction to Machine Learning with Python. The book’s author, Andreas Müller, discussed the upsides and challenges of the publication process in his presentation. In addition, O’Reilly gave conference attendees several free mini-books on topics like machine learning and valuable Python libraries, all of which were created using Jupyter.

Slides from the presentations are posted here.