Schedule - Day 1
2019-Nov-16, 8:00-9:00
Breakfast & Registration
2019-Nov-16, 9:00-10:30
Concert Hall
2019-Nov-16, 10:30-10:55
Talk room #1 (Concert Hall)
Techniques for algorithmic content creation have found uses in industries like film, music, and, most notably, video games. More importantly, they are extremely fun to play with. In this talk I will share procedural generation methods I have encountered in both hobby and professional projects, and show how they can be implemented in Python. We will look at several methods and their applications in the generation of textures, 3D models, music, and outfits.
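The abstract doesn't name its specific methods, but one classic procedural technique for textures and terrain is midpoint displacement; a minimal, hypothetical sketch in pure Python:

```python
import random

random.seed(7)

def midpoint_displacement(levels, roughness=0.5):
    """1D fractal heightmap: repeatedly insert midpoints with decaying random jitter."""
    heights = [0.0, 0.0]
    scale = 1.0
    for _ in range(levels):
        nxt = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2 + random.uniform(-scale, scale)
            nxt += [a, mid]
        nxt.append(heights[-1])
        heights = nxt
        scale *= roughness  # each level adds finer, smaller bumps
    return heights

terrain = midpoint_displacement(4)
print(len(terrain))  # 17 points: 2 endpoints plus 4 rounds of inserted midpoints
```

The same subdivide-and-jitter idea generalizes to 2D (the diamond-square algorithm) for texture and terrain generation.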
Talk room #2 (Sky Room)
Writing code in a Jupyter Notebook is an interactive process involving a lot of trial and error. As your code evolves, errors and bugs inevitably start to creep in. A debugger can help track them down. In this talk, we’ll go through some reasons why you may want to debug your notebook. Then, we’ll explore how you can debug notebooks with the ipdb debugger. Finally, we’ll see how we can use an IDE to track down those pesky bugs. Questions such as 'How do you debug code in a Jupyter Notebook?' and 'Can I edit and run the cells until I get what I intended?' will be explored.
PyData Track (Round Room)
What does a decade of news stories published on cbc.ca tell us about Canada? What words and ideas are associated with different cities, provinces, and public figures? Can it tell us who is Montreal's Drake or what is the Vancouver equivalent of poutine? Can it reveal unconscious biases? Are certain words more associated with the word 'man' than the word 'woman'? With 'black' versus 'white', 'indigenous', or 'immigrant'? In this talk, I'll show how I trained a neural word embedding model with hundreds of thousands of news stories using the gensim library and explored the word associations through a Jupyter notebook.
Tutorial (Clipper Room)
Susan Li walks you through deep learning methods for natural language processing (NLP) tasks using Python and open-source libraries, with a live example. Methods include word2vec embeddings, recurrent neural networks (RNN), and convolutional neural networks (CNN). This is a hands-on approach to framing a real-world problem in terms of the underlying NLP tasks and building an NLP application using deep learning. If you are a data scientist or software developer with experience in Python who wants to develop natural language processing software, this talk is for you.
2019-Nov-16, 11:15-11:40
Talk room #1 (Concert Hall)
Have you ever wanted to move fast, but were too afraid to break too many things? Learn from the experience of deploying the largest Python site in the world! It is deployed every 7 minutes! This talk details the practical steps that we took to build our continuous deployment pipeline. We will highlight the technical challenges, discuss the tools, and share the cultural philosophy necessary. Plus, we'll explain how to recover when things break. The steps can be used by any development and infra team as a blueprint towards a continuous deployment system. Development and infra teams that already do some form of continuous deployment will identify with the problems and hear how we solved them. Others will hopefully be inspired when they learn that they can start with something simple and progress step by step.
Talk room #2 (Sky Room)
The Python data science stack is composed of a rich set of powerful libraries that work wonderfully well together, providing coherent, beautiful, Pythonic APIs that let the Data Scientist think less about programming and more about the data. However, many of these libraries are largely single-threaded (e.g., Pandas, Scikit-Learn), and as data workflows grow larger, they quickly run up against this limitation. RAPIDS is a suite of open-source libraries that provide APIs nearly identical to existing popular Python libraries. By leveraging the massively parallel processing capabilities of GPUs, RAPIDS libraries can provide speedups of 50x or more over their CPU-only counterparts. cuDF is a GPU DataFrame library following the Pandas API. cuML is a GPU Machine Learning library following the Scikit-Learn API. cuGraph is a GPU Graph Analytics library with an API inspired by NetworkX. This talk will provide an overview of the RAPIDS ecosystem, with a focus on the cuDF library, its features and design. We'll show how cuDF combines the use of Numba, Cython, modern C++, CUDA, and Apache Arrow to build a highly performant DataFrame library that is also highly interoperable with other libraries in the PyData ecosystem. We'll show examples of workflows using cuDF both on a single GPU, and across multiple GPUs in conjunction with the Dask library. We'll also share some performance results, best practices, tips, and tricks.
PyData Track (Round Room)
PyCon partnered with Autism Speaks in 2015 and held an annual 5K fun run to raise awareness of Autism Spectrum Disorder (ASD). It's great to see tech conferences supporting such a cause, but wouldn't it be incredible if we could leverage Python itself to understand autistic children and help them lead better lives? In this talk, we'll cover biosensors, biosignal processing in Python, and machine learning to find out how we can do just that. Since children with ASD have difficulty expressing themselves and communicating, traditional methods of recognizing emotions through facial expressions or speech recognition tend to have low accuracy. The technique of mapping emotions through physiological signals of the body using biosignal processing in Python is relatively new and unexplored, and it forms the basis of this talk. The talk introduces various biosensors that can capture real-time physiological signals of the body, followed by the techniques involved in biosignal processing with Python (NeuroKit, BioSPPy), and concludes by explaining how machine learning algorithms like SVM and k-NN can help map these real-time signals to emotions.
Tutorial (Clipper Room)
Susan Li walks you through deep learning methods for natural language processing (NLP) tasks using Python and open-source libraries, with a live example. Methods include word2vec embeddings, recurrent neural networks (RNN), and convolutional neural networks (CNN). This is a hands-on approach to framing a real-world problem in terms of the underlying NLP tasks and building an NLP application using deep learning. If you are a data scientist or software developer with experience in Python who wants to develop natural language processing software, this talk is for you.
2019-Nov-16, 12:00-12:25
Talk room #1 (Concert Hall)
Communicating dates and times with another person is not so simple. “See you at 6 o’clock on Monday” sounds understandable. But was it a.m. or p.m.? And was your friend in the same time zone as you when you said that? When we need to use and store dates and times in Python, we have similar and even thornier issues, since we can express a date and time in many ways. For example, “July 15, 2019 07:05 pm”, “2019-07-15 19:05:53 CDT”, “2019-07-15T23:05:53.256587-05:00”, and 1563231953 express the exact same date and time, yet the types used and the formats look very different. In this talk we'll explore Python datetime best practices to reduce the complexity when using, formatting, and storing datetimes on a daily basis.
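One best practice the abstract alludes to is parsing timestamps with their UTC offsets and normalizing to UTC for storage, rendering local forms only at the edges; a minimal stdlib sketch:

```python
from datetime import datetime, timezone

# Parse a timestamp that carries an explicit UTC offset (here, -05:00).
local = datetime.fromisoformat("2019-07-15T19:05:53-05:00")

# Normalize to UTC before storing or comparing.
utc = local.astimezone(timezone.utc)

# Store/transmit as an ISO 8601 string in UTC.
stored = utc.isoformat()
print(stored)  # 2019-07-16T00:05:53+00:00
```

Keeping everything timezone-aware and in UTC internally avoids most of the ambiguities the talk describes.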
Talk room #2 (Sky Room)
Large projects often use Kafka to provide durable real time processing over huge amounts of data. But, Kafka is nothing like Redis, SQS, or any Celery backend. We’ll look at how Kafka’s design makes it amazing at some tasks (real time durable processing) and awful at others (this isn’t a Celery backend for a reason). How to avoid having your Kafka clients sitting idle and ensure your messages are actually recorded in order and in real time! Finally, we’ll look at how a Pythonista can take advantage of Java-only tech like Kafka Streams with KSQL.
PyData Track (Round Room)
We've all seen beautiful data visualizations on the web and elsewhere. A good visualization can make a persuasive point or give new insights. How can you create beautiful and useful visualizations without too much effort? The secret behind each visualization is a set of well-organized data. In this case, we look at data from online surveys, which is not typically well organized, and see how we can transform it and easily visualize it using the 'Grammar of Graphics' approach. The 'Grammar of Graphics' is a way of associating different data points with different aspects of a chart. You provide the type of chart that you want, specify which data you want on the x and y axes and how you want to group your data, and you get a reasonable chart. As you do a data analysis, it is important to understand your data. A visualization tool that can quickly generate useful charts during analysis and also generate the finished charts for production is ideal. The altair package excels at both these jobs. It allows you to create web-ready visualizations using this approach in Python. This talk will demonstrate how altair can be used with survey data to get quick insights out of a survey, or any other data source. Altair is built upon Vega and Vega-Lite, which are JavaScript libraries. They work well with Jupyter notebooks and are useful for data exploration. The data behind the chart and the code for the chart itself are stored as JSON and can be included on any web page, so the visualizations are independent of Python.
Tutorial (Clipper Room)
This is an open space time and available for group meetings. Please sign up using the 2019 wiki
2019-Nov-16, 12:30-13:30
Lunch
2019-Nov-16, 13:30-13:55
Talk room #1 (Concert Hall)
Is your code running slower than you would like? If so, how do you even begin identifying performance bottlenecks? This talk will teach you how to profile your Python program and interpret flame graphs to find the best candidates for speedups. Through a real-life case study, we'll also see common performance anti-patterns and simple remediation techniques. The case study will be a crypto-assets trading strategy backtesting program that was way too slow, and for which the techniques covered in this talk yielded amazing results, and even led to improvements to third-party libraries!
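The stdlib's cProfile is the usual starting point for the kind of profiling the abstract describes; a small sketch (the function here is a placeholder, not the talk's case study):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # A deliberately simple hot loop standing in for real work.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Sort by cumulative time to surface the best candidates for speedups.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()  # per-function call counts and cumulative times
```

Tools like py-spy or snakeviz can then turn this kind of profile data into the flame graphs the talk mentions.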
Talk room #2 (Sky Room)
In this talk, I will give an overview and real-time coding demos of the power of Django + Wagtail for building intuitive content management systems. Wagtail is a framework built on top of Django that adds a whole new level of power to the Python platform. Whether you are familiar with Django or not, this talk is aimed at anyone looking to try out something other than the industry go-to of WordPress while exploring the workflow. The talk includes an overview of features, common custom CMS issues that Wagtail handles gracefully, and code examples of creating page models and customizing the admin. I'll be showcasing a real-world example of a solution we provided a client that had many complex requests. Through this exercise, I hope to illustrate the power through simplicity that Wagtail provides.
PyData Track (Round Room)
Regardless of your business, being able to anticipate your users’ next action is a valuable advantage, whether that be a purchase, a view, or even a cancellation. Typical modelling approaches to predict users’ actions have focused on one specific action, e.g., conversion or churn. Here, we take a more holistic approach and don’t limit ourselves to one action. We model a user’s journey, so that we can anticipate not only a user’s next action but also the one after that.
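The abstract doesn't specify its modelling approach, but the simplest holistic journey model is a first-order Markov chain over observed actions; a hypothetical stdlib sketch (action names are made up for illustration):

```python
from collections import Counter, defaultdict

def train(journeys):
    """Count transitions between consecutive actions across user journeys."""
    transitions = defaultdict(Counter)
    for journey in journeys:
        for current, nxt in zip(journey, journey[1:]):
            transitions[current][nxt] += 1
    return transitions

def predict_next(transitions, action):
    """Most likely next action after `action` (first-order Markov assumption)."""
    counts = transitions[action]
    return counts.most_common(1)[0][0] if counts else None

journeys = [
    ["view", "add_to_cart", "purchase"],
    ["view", "add_to_cart", "abandon"],
    ["view", "view", "add_to_cart", "purchase"],
]
model = train(journeys)
print(predict_next(model, "add_to_cart"))  # purchase
```

Chaining `predict_next` gives the "action after that" the abstract mentions; real systems typically replace the argmax with sampled probabilities or a sequence model.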
Tutorial (Clipper Room)
2019-Nov-16, 14:15-14:40
Talk room #1 (Concert Hall)
The Zen of Python, accessed by running import this, is a list of nineteen aphorisms that have guided the development of the language. It has good advice for how to organize our code, but what does it have to say about how we organize ourselves? Plenty: the Zen of Python is not only a solid set of development principles; the other easter egg is that it’s packed with wisdom about how to build healthy teams. In this talk I draw upon my time as an engineering manager leading Python teams to tell stories of what the Zen of Python has to teach us about communication and conflict, building inclusive teams and transparent processes, and promoting psychological safety. Come ready to reflect on and feel inspired by a new interpretation of these principles, and bring what you learn back to your meetup, study group, open source project, or team.
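The easter egg is easy to inspect programmatically: the this module stores the aphorisms ROT13-encoded in this.s (importing it also prints them):

```python
import codecs
import this  # importing prints the Zen; the raw text lives in this.s, ROT13-encoded

zen = codecs.decode(this.s, "rot13")

# Drop the title line and blank lines to get the aphorisms themselves.
principles = [line for line in zen.splitlines()
              if line and not line.startswith("The Zen")]
print(len(principles))  # 19 aphorisms
```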
Talk room #2 (Sky Room)
Sometimes you need to scale the performance of your Python code, or you need to hook into a C API. Wouldn't it be nice not having to do that in C or C++? This talk walks through how to accelerate Python code using a binding written in Rust (a new safe, fast systems level programming language).
PyData Track (Round Room)
How can you use machine learning with python to detect situations that are weird, abnormal, or different? Anomaly detection may be your answer! In this talk, we’ll review some interesting anomaly detection applications across multiple domains (from healthcare to finance, to name a couple). Finally, we’ll walk through a practical example of anomaly detection, for use in an industrial setting, using deep-learning and TensorFlow 2.0.
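As a contrast to the deep-learning approach the talk will demonstrate, the simplest statistical baseline for anomaly detection is a z-score rule; a stdlib-only sketch (the sensor values are invented, and the 2.5 threshold is deliberate: in small samples a single outlier's z-score is bounded well below 3):

```python
import statistics

def zscore_anomalies(readings, threshold=2.5):
    """Flag readings more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(readings)
    stdev = statistics.stdev(readings)
    return [x for x in readings if abs(x - mean) / stdev > threshold]

# Hypothetical industrial sensor trace with one obvious spike.
sensor = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1, 19.7, 20.0]
print(zscore_anomalies(sensor))  # [35.0]
```

Deep-learning approaches such as autoencoders generalize this idea: instead of distance from the mean, they flag points with high reconstruction error.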
Tutorial (Clipper Room)
2019-Nov-16, 15:00-15:25
Talk room #1 (Concert Hall)
Integrating with third-party services can be challenging. Network connections, API endpoints, and third-party Python client packages are often unreliable or poorly documented. Debugging production problems can be difficult because reproducing the issue is time-consuming. In this presentation, I will share expertise learned from years of building and maintaining integrations with 60+ services, including Salesforce, Intercom, HubSpot, Zendesk, Xero, NetSuite, and others.
Talk room #2 (Sky Room)
Over the last few years machine learning has drawn a lot of attention from both inside and outside the data science community. The internet is flooded with articles on the latest or coolest algorithms. What these articles often don’t cover is that at the beginning of your project, you'll be spending a lot of time collecting, cleaning and otherwise pre-processing your data, no matter what type of project or model you’re working on. There’s a tendency to dismiss this first stage as mundane, but this couldn’t be further from the truth. This first, exploratory, stage of the analysis is when you'll learn most about the information that is available for solving your problem and how to harness it. In this talk, I’ll use practical examples to describe some of the statistical techniques that I've found most useful over the years. For instance, box plots offer a simple way to detect outliers and inconsistencies. Others, like imputation, are more complex and can even leverage machine learning. These methods can be combined in multiple ways to create useful representations of data, making building a good model a whole lot easier.
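The box-plot outlier check the speaker mentions reduces to the standard 1.5 x IQR whisker rule; a stdlib sketch with invented data:

```python
import statistics

def iqr_outliers(values):
    """Flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR], the box-plot whisker rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]

data = [10, 12, 11, 13, 12, 11, 95, 12, 10, 11]
print(iqr_outliers(data))  # [95]
```

Flagged points can then be dropped, capped, or handed to an imputation step like the ones the talk covers.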
PyData Track (Round Room)
This talk presents a systematic approach to understanding and implementing Genetic Algorithms, with hands-on experience of solving a real-world problem. The inspiration and methods behind GAs will be covered, along with fundamental topics like fitness functions, mutation, and crossover, and the limitations and advantages of using them.
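The building blocks the abstract lists (fitness, selection, crossover, mutation) fit in a few lines; a toy, hypothetical example evolving a string toward a target, not the talk's real-world problem:

```python
import random

random.seed(42)

TARGET = "HELLO WORLD"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def fitness(candidate):
    # Count characters that match the target at the same position.
    return sum(a == b for a, b in zip(candidate, TARGET))

def crossover(p1, p2):
    # Single-point crossover: a prefix of one parent spliced onto a suffix of the other.
    cut = random.randrange(len(TARGET))
    return p1[:cut] + p2[cut:]

def mutate(candidate, rate=0.05):
    # Each character has a small chance of being replaced at random.
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in candidate)

population = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(200)]
for generation in range(500):
    population.sort(key=fitness, reverse=True)
    if population[0] == TARGET:
        break
    parents = population[:50]  # selection: keep the fittest quarter (elitism)
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(150)]
    population = parents + offspring

best = max(population, key=fitness)
print(generation, best)
```

Real GA applications swap in a domain-specific fitness function and encoding; the loop structure stays the same.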
Tutorial (Clipper Room)
This is an open space time and available for group meetings. Please sign up using the 2019 wiki
2019-Nov-16, 15:25-16:00
Afternoon Break
2019-Nov-16, 16:00-16:10
Talk room #1 (Concert Hall)
In this talk, we introduce the pyblitzdg module for physical model development and unveil some of the power that it puts into the hands of the scientific model developer. pyblitzdg is a new open-source Python 3 extension module that provides bindings to the C++ modelling library blitzdg, which incorporates the blitz++ tensor arithmetic library. pyblitzdg excels at carrying out fast simulations of wave dynamics in sophisticated geometries. With support for both Finite Volume (FV) and Discontinuous Galerkin (DG) numerical methodologies, a wide set of tools is made available to the model developer. Object-oriented programming is not required to use pyblitzdg, and simple procedural-style simulation programs can usually be written in a single ~100-line Python 3 script. The syntax relies on NumPy and will be familiar to users of widespread mathematical software like MATLAB or GNU Octave. Worked examples relevant to real-world physical problems will be shown, and future application areas and potential extensions will be revealed.
Talk room #2 (Sky Room)
Municipal departments all work together for the common good of our residents through our strategic plan, but we may not always be up to date on what other staff or departments are working on. The Planning Services department of the Municipality of Clarington has utilized in-house expertise and infrastructure to build several apps that allow our department and other departments within the Municipality to view our current workload, collaborate on projects, and stay organized. The three apps that have been most successful at helping departments work together will be demoed. The Planning Applications Portal shows the ongoing planning applications at the Municipality, with reports and statistics. The GIS Portal is used by the GIS professionals of the Municipality to stay organized, establish governance, collaborate, and share with each other. Thirdly, the Geospatial Data Inventory is a massive database of all the geospatial data the Municipality manages, including visualization, data requests, and metadata. After the brief demos, we will walk through how to build one of these at your organization. They are all built with Python, HTML, CSS, and JavaScript.
PyData Track (Round Room)
Machine learning algorithms are susceptible to both intentional and unintentional bias. Relying on biased algorithms to drive decisions can lead to unfair outcomes that have serious consequences affecting underrepresented groups of people. In this talk, we'll walk through examples of algorithmic bias in machine learning algorithms, explore tools (in Python) that can measure this bias, and discuss good ethics and software engineering strategies to mitigate bias in machine learning algorithms.
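One of the simplest bias measurements such tools compute is demographic parity: the gap in favorable-outcome rates between groups. A hypothetical stdlib sketch (the groups and decisions are invented):

```python
def selection_rates(outcomes):
    """outcomes: mapping of group -> list of binary decisions (1 = favorable)."""
    return {group: sum(d) / len(d) for group, d in outcomes.items()}

def demographic_parity_difference(outcomes):
    """Gap between the highest and lowest favorable-outcome rate across groups."""
    rates = selection_rates(outcomes)
    return max(rates.values()) - min(rates.values())

# Hypothetical loan-approval decisions for two groups.
decisions = {
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],  # 75% approved
    "group_b": [1, 0, 0, 0, 1, 0, 0, 1],  # 37.5% approved
}
print(demographic_parity_difference(decisions))  # 0.375
```

A difference near zero suggests parity on this one metric; dedicated libraries such as Fairlearn and AIF360 compute this alongside many other fairness measures.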
Tutorial (Clipper Room)
The Django ORM is extremely powerful and is a vital part of the framework. Because it is simple and intuitive to use, some of its advanced features often remain unused, leading to inefficiencies. In this cookbook-style tutorial, a selection of nontrivial query use cases will be addressed, each of which will introduce an ORM feature or two. Notable topics include advanced querying techniques, database functions, performance optimization, and legacy databases. In this interactive session, participants will be encouraged to share knowledge or ask questions based on their real-life challenges related to the topic at hand. The session will end with a discussion of ORM features recently introduced in Django as well as upcoming ones.
2019-Nov-16, 16:20-16:30
Talk room #1 (Concert Hall)
When most of us think of agriculture, we don't think of it as a cutting-edge playground for AI, robotics, and data science development in Python. As data science and AI methodologies have matured, people have become interested in applying these tools to different fields (finance, journalism, agriculture) to gain meaningful insights. I came across hydroponics, a method of growing plants in water without soil. I was deeply interested and started to question the idea of attaching a few sensors to plants. Given the data, I began to explore automating the plant growth process. This talk focuses on performing exploratory data analysis and using neural networks in Python to implement a smart, intelligent hydroponics system.
Talk room #2 (Sky Room)
There was a time that, no matter what language I was writing in, it always came out looking a little bit like FORTRAN. As I added on some OOP-ness and more modern languages, I eventually arrived at Python, a language which has a clear 'idiomatic' style, and its own adjective. As it happens, I learned Python and Italian at the same time. I came to Python already working with Ruby, and I came to Italian already speaking French. These languages that are superficially similar have led me to some strange accidents, like asking at the Vatican if the bathrooms were stopped, and never, ever being able to write a for-each loop without looking up the syntax. What I realized is that it is hard to learn to speak like a native (or really at all) when you have nobody to talk to. People care about PEP8, but a computer will keep merrily running along and not tell you that your code looks weird. Also, no matter how many times I say the same sentence to Duolingo, it's just not the same as finding myself making a mistake in the middle of, 'I went to Rome last year,' in my first conversation with an actual Italian. The following concepts will be touched on in this talk: The challenges of learning languages (spoken or programming) from an app, the benefits of letting the other languages languish while you pick up a new one, the importance of community (Exercism, Meetup, and the kindness of strangers) and the excitement of being able to talk about the weather.
PyData Track (Round Room)
We have always been taught that the earlier you book a flight, the cheaper it will be. What if I said it is not? Fares often reach a minimum on some day before the flight. We are going to explore how historical airfare data can help find the best deals. The talk will cover the whole process, including gathering the data and building a basic neural network model. With the advancements in deep learning in recent years, it is easy to train a simple statistical model to predict prices.
Tutorial (Clipper Room)
The Django ORM is extremely powerful and is a vital part of the framework. Because it is simple and intuitive to use, some of its advanced features often remain unused, leading to inefficiencies. In this cookbook-style tutorial, a selection of nontrivial query use cases will be addressed, each of which will introduce an ORM feature or two. Notable topics include advanced querying techniques, database functions, performance optimization, and legacy databases. In this interactive session, participants will be encouraged to share knowledge or ask questions based on their real-life challenges related to the topic at hand. The session will end with a discussion of ORM features recently introduced in Django as well as upcoming ones.
2019-Nov-16, 16:40-16:50
Talk room #1 (Concert Hall)
The game engine Ren'Py is an open-source engine used to make countless interactive fiction games, also known as visual novels. I learned to program in Python using this engine and have released a commercial game with it. The talk will dig into the source code of the engine and explore how OS-level details are handled for game developers, including memory optimization and cross-platform game saves, plus more cool stuff.
Talk room #2 (Sky Room)
This presentation will be in French. I will talk about how I use sensors, data, and Python to support my triathlon training, with the goal of qualifying for the Ironman 70.3 World Championship.
PyData Track (Round Room)
Calculate mel spectrograms from human voices using Python and train an algorithm to compare human voices.
Tutorial (Clipper Room)
The Django ORM is extremely powerful and is a vital part of the framework. Because it is simple and intuitive to use, some of its advanced features often remain unused, leading to inefficiencies. In this cookbook-style tutorial, a selection of nontrivial query use cases will be addressed, each of which will introduce an ORM feature or two. Notable topics include advanced querying techniques, database functions, performance optimization, and legacy databases. In this interactive session, participants will be encouraged to share knowledge or ask questions based on their real-life challenges related to the topic at hand. The session will end with a discussion of ORM features recently introduced in Django as well as upcoming ones.