Reticulate is a package for R that allows you to run Python code inside of R. Since both Python and R are very popular for common data science tasks, it makes sense that you would want to use them together. In this talk, I'll demo how to run a Python package inside of R.
The Circuit Playground Express (CPX) is a small circuit board you can program with Python. But what if you want to make the lights blink and play sounds at the same time, while keeping the code for those two things separate? I'll briefly show you how to program the CPX with Python, then talk about how to run multiple pieces of code at the same time using generators. This talk was inspired by the closing keynote at PyCon 2019.
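As a rough illustration of the generator idea, here is a minimal sketch in ordinary Python rather than on the CPX itself. The task names `blink` and `beep` and the scheduler `run_round_robin` are hypothetical stand-ins, not code from the talk; on real hardware each step would toggle a pixel or start a tone instead of producing a string.

```python
from collections import deque


def blink(times):
    """Hypothetical task: each step stands in for toggling a light."""
    for i in range(times):
        yield f"blink {i}"


def beep(times):
    """Hypothetical task: each step stands in for playing a tone."""
    for i in range(times):
        yield f"beep {i}"


def run_round_robin(tasks):
    """Advance each generator one step in turn until all are exhausted."""
    queue = deque(tasks)
    log = []
    while queue:
        task = queue.popleft()
        try:
            log.append(next(task))
        except StopIteration:
            continue  # task finished, so it is not put back in the queue
        queue.append(task)
    return log


events = run_round_robin([blink(2), beep(3)])
print(events)
```

Each task keeps its own code and state, and the scheduler interleaves them by asking each one for a single step at a time.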
A spreadsheet is a wonderful invention and an excellent tool for certain jobs. All too often, however, spreadsheets are called upon to perform tasks that are beyond their capabilities. It is like the old saying, “If the only tool you have is a hammer, every problem looks like a nail.” But some problems are better addressed with a screwdriver, with glue, or with a Swiss Army knife.
Python is often called the Swiss Army knife of the programming world because of its versatility and flexibility. That is why it has become increasingly popular over time. Companies can adopt Python to handle complex processes over the long term.
During this talk, Ryan will give his firsthand account of Excel Hell and how he managed to escape it using Python. He will also discuss some of the relevant libraries he uses for web scraping, data processing, analysis, and visualization, including Requests, Pandas, Flask, and Airflow, as well as a few strategies he uses when approaching problems with data.
Ryan S. McCoy is a Data Engineer at gotem, LLC, where he is responsible for helping modernize the systems, data infrastructures, and analytics of companies primarily in the Financial Services industry, including Investment Managers, Hedge Funds, Venture Capital funds, and data vendors. Previously he spent a decade at several institutional investment funds located in St. Louis.
GitHub Repo -> https://github.com/ryansmccoy/spreadsheets-to-dataframes
pudb is a feature-rich terminal-based debugger that is a great alternative to Python's built-in debugger (pdb). This demo will show how to launch into the debugger, as well as how to use its remote functionality to connect to and troubleshoot multi-process apps that do not run in the foreground.
Python has a great library for interacting with Kubernetes (k8s) clusters. This talk will discuss two quick tools to get your feet wet when it comes to interacting with k8s using Python and show you some of the things to look out for, as well as the basics of local vs intra-cluster security.
"the phone calls are coming from inside the house!"
The first service is a simple Flask-based application that will run as a pod inside the cluster, exposing its endpoint using Service and Ingress resources. When you call the "/pod/versions" endpoint, it will return the versions of any applications running in the cluster as JSON. There are some security constraints built into k8s that you should be aware of when trying to access the k8s API internally. We will walk you through how to allow this service to access this API, even with Role-Based Access Control (RBAC) enabled, using a ServiceAccount. This method will grant only this specific service inside a particular namespace read-only access to pod information for the cluster.
The second application will make use of this Flask endpoint and be run from your local command line, using your k8s config file to get access. We will then use it to compare a secondary application running in a different namespace. This is a smaller version of some real-world tooling we use at Rally Health as we migrate from Mesos to k8s and need to compare state between these two environments as well as between clusters in different environments. These techniques are just the tip of the iceberg, but ideally they should give you some idea as to what the Kubernetes Python client is capable of handling.
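The comparison step can be sketched without a cluster at all. The example below is an assumption about the shape of the data, not the speakers' tooling: it supposes each environment has already been reduced to a plain mapping of application name to version string (for instance, from the JSON the endpoint returns), and the helper name `diff_versions` and the sample data are hypothetical.

```python
def diff_versions(left, right):
    """Return {app: (left_version, right_version)} for apps whose versions differ.

    An app missing from one side is reported with None on that side.
    """
    diffs = {}
    for app in sorted(set(left) | set(right)):
        left_version = left.get(app)
        right_version = right.get(app)
        if left_version != right_version:
            diffs[app] = (left_version, right_version)
    return diffs


# Hypothetical version maps for two namespaces or clusters.
staging = {"api": "1.4.2", "web": "2.0.0", "worker": "0.9.1"}
production = {"api": "1.4.1", "web": "2.0.0"}

drift = diff_versions(staging, production)
print(drift)
```

Keeping the comparison as a pure function over dictionaries means the same check can be pointed at two namespaces, two clusters, or two different schedulers.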
Building services is important, but what happens after they are built and running in production? How do we establish trust with our customers that our service will actually be available? Who creates these definitions and how do we measure them? Service Level Indicators (SLI), Agreements (SLA), and Objectives (SLO) are central to an operations mindset and foundational tools for effective Site Reliability Engineering. This talk will take you on a journey through Springfield as we discuss exactly what SLIs, SLAs, and SLOs are, how to measure them, what targets should be measured, how to define uptime, availability, and acceptable error rates, and what happens when they are breached. Attendees will leave with a clear understanding of how to monitor and report for their services, how SLIs, SLAs and SLOs can aid in this process, and how to implement them within their own teams.
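To make the arithmetic behind an availability target concrete, here is a small sketch. It assumes a request-based SLI (good requests divided by total requests) and uses the common "error budget" framing; the function names and the sample numbers are illustrative assumptions, not figures from the talk.

```python
def availability(good_requests, total_requests):
    """SLI: fraction of requests that were served successfully."""
    return good_requests / total_requests


def error_budget_remaining(slo_target, good_requests, total_requests):
    """Failures still allowed before the SLO is breached (negative means breached)."""
    allowed_failures = (1 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    return allowed_failures - actual_failures


def allowed_downtime_minutes(slo_target, days=30):
    """Downtime permitted by a time-based availability target over a window."""
    return (1 - slo_target) * days * 24 * 60


# Illustrative numbers: a 99.9% SLO over one million requests.
sli = availability(999_500, 1_000_000)
remaining = error_budget_remaining(0.999, 999_500, 1_000_000)
downtime = allowed_downtime_minutes(0.999)
print(sli, remaining, downtime)
```

With these sample numbers the service sits at 99.95% availability and has used about half of its budget, and a 99.9% target over 30 days allows roughly 43 minutes of downtime.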
Ray is a framework for distributing and scaling clustered, high-performance Python applications. It is used in several ML/AI systems and production deployments. This talk explains the problems that Ray solves, including rapid execution of “tasks” and management of distributed state, such as model parameters during training. I’ll use several example applications to illustrate. You'll learn when and how to use Ray in your projects.
Most computer languages offer "int"s and "reals" and maybe some support for "complex" or fixed-point decimal. Python goes further. This talk will discuss the numeric types that ship with Python (such as Fraction and Decimal), numeric types from NumPy, and the Abstract Base Classes that make it possible to add your own specialized numeric type and have it appear as part of the language.
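A short sketch of the standard-library pieces mentioned above, using only the `fractions`, `decimal`, and `numbers` modules. The variable names are illustrative; the point is that exact rational and decimal arithmetic are available out of the box, and that the abstract base classes in `numbers` describe where each type sits in the numeric tower.

```python
from decimal import Decimal
from fractions import Fraction
import numbers

# Exact rational arithmetic: no rounding error.
half = Fraction(1, 3) + Fraction(1, 6)

# Exact decimal arithmetic, in contrast to binary floats.
float_sum_is_exact = (0.1 + 0.2 == 0.3)
decimal_sum = Decimal("0.10") + Decimal("0.20")

# The abstract base classes classify numeric types.
int_is_integral = isinstance(3, numbers.Integral)
fraction_is_rational = isinstance(half, numbers.Rational)
float_is_real = isinstance(1.5, numbers.Real)
# Decimal is registered as a Number but deliberately not as a Real.
decimal_is_number = isinstance(decimal_sum, numbers.Number)
decimal_is_real = isinstance(decimal_sum, numbers.Real)

print(half, decimal_sum, float_sum_is_exact)
```

A custom numeric type can join this hierarchy by subclassing the appropriate class from `numbers` and implementing its abstract methods.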
Have you ever wondered how your computer knows what programs are running? What about what happens behind the scenes when you start a program? This talk will cover the basics of how processes work, and how your operating system keeps track of what's running. By the end of it you will know enough to write your own basic versions of 'ps' or 'top'.
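As one possible starting point for a basic 'ps', the sketch below reads the Linux `/proc` filesystem, where each running process appears as a directory named after its process ID. This is an assumption about the approach, not necessarily the one the talk takes, and it only works on Linux; elsewhere the hypothetical helper `list_processes` simply returns an empty list.

```python
import os


def list_processes(proc_root="/proc"):
    """Return sorted (pid, name) pairs read from a Linux-style /proc tree."""
    processes = []
    if not os.path.isdir(proc_root):
        return processes  # not Linux, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        comm_path = os.path.join(proc_root, entry, "comm")
        try:
            with open(comm_path, errors="replace") as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited or is not readable
        processes.append((int(entry), name))
    return sorted(processes)


for pid, name in list_processes():
    print(f"{pid:>7}  {name}")
```

A 'top'-style tool would read further per-process files for CPU and memory figures and refresh the listing on a timer.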
The rapid growth of Python is due in part to its exceptional toolkit for Data Analysts, Scientists, and Engineers. Packages like Pandas, Scikit-Learn, PySpark, and Dask have become staples for teams looking to process data. However, when processing large amounts of data, there are times when Python might not be the right solution for your task. In this conversation, we'll learn about cloud-based Data Warehouses, such as Google's BigQuery, Amazon's Redshift, and Snowflake. You'll learn about the advantages of these platforms compared to in-memory processing in Python. We'll also show examples of how you can use Apache Airflow to automate recurring tasks, turning your Data Warehouse into the cornerstone of your Data Science infrastructure.