The Machine Learning Podcast

Detailed and technical explorations of machine learning and artificial intelligence with the researchers, engineers, and entrepreneurs who are shaping the industry

About the show

This show goes behind the scenes for the tools, techniques, and applications of machine learning. Model training, feature engineering, running in production, career development... Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.


  • The Role Of Model Development In Machine Learning Systems

    May 28th, 2023  |  46 mins 41 secs

    The focus of machine learning projects has long been the model that is built in the process. As AI-powered applications grow in popularity and power, the model is just the beginning. In this episode Josh Tobin shares his experience from his time as a machine learning researcher up to his current work as a founder at Gantry, and the shift in focus from model development to machine learning systems.

  • Real-Time Machine Learning Has Entered The Realm Of The Possible

    March 9th, 2023  |  34 mins 29 secs

    Machine learning models have predominantly been built and updated in a batch modality. While this is operationally simpler, it doesn't always provide the best experience or capabilities for end users of the model. Tecton has been investing in the infrastructure and workflows that enable building and updating ML models with real-time data to allow you to react to real-world events as they happen. In this episode CTO Kevin Stumpf explores the benefits of real-time machine learning and the systems that are necessary to support the development and maintenance of those models.

  • How Shopify Built A Machine Learning Platform That Encourages Experimentation

    February 2nd, 2023  |  1 hr 6 mins

    Shopify uses machine learning to power multiple features in their platform. To reduce the effort required to develop and deploy models, they have invested in building an opinionated platform for their engineers. They have gone through multiple iterations of the platform and their most recent version is called Merlin. In this episode Isaac Vidas shares the use cases that they are optimizing for, how it integrates into the rest of their data platform, and how they have designed it to let machine learning engineers experiment freely and safely.

  • Applying Machine Learning To The Problem Of Bad Data At Anomalo

    January 23rd, 2023  |  59 mins 24 secs

    All data systems are subject to the "garbage in, garbage out" problem. For machine learning applications bad data can lead to unreliable models and unpredictable results. Anomalo is a product designed to alert on bad data by applying machine learning models to various storage and processing systems. In this episode Jeremy Stanley discusses the various challenges that are involved in building useful and reliable machine learning models with unreliable data and the interesting problems that they are solving in the process.

  • Build More Reliable Machine Learning Systems With The Dagster Orchestration Engine

    December 1st, 2022  |  45 mins 43 secs
    data orchestration, mlops

    Building a machine learning model one time can be done in an ad-hoc manner, but if you ever want to update it and serve it in production you need a way of repeating a complex sequence of operations. Dagster is an orchestration engine that understands the data that it is manipulating so that you can move beyond coarse task-based representations of your dependencies. In this episode Sandy Ryza explains how his background in machine learning has informed his work on the Dagster project and the foundational principles that it is built on to allow for collaboration across data engineering and machine learning concerns.

  • Solve The Cold Start Problem For Machine Learning By Letting Humans Teach The Computer With Aitomatic

    September 27th, 2022  |  52 mins 7 secs

    Machine learning is a data-hungry approach to problem solving. Unfortunately, there are a number of problems that would benefit from the automation provided by artificial intelligence capabilities that don’t come with troves of data to build from. Christopher Nguyen and his team at Aitomatic are working to address the "cold start" problem for ML by letting humans generate models by sharing their expertise through natural language. In this episode he explains how that works, the various ways that we can start to layer machine learning capabilities on top of each other, as well as the risks involved in doing so without incorporating lessons learned in the growth of the software industry.

  • Convert Your Unstructured Data To Embedding Vectors For More Efficient Machine Learning With Towhee

    September 21st, 2022  |  51 mins 53 secs

    Data is one of the core ingredients for machine learning, but the format in which it is understandable to humans is not a useful representation for models. Embedding vectors structure data in a form that is native to how models interpret and manipulate information. In this episode Frank Liu shares how the Towhee library simplifies the work of translating your unstructured data assets (e.g. images, audio, video) into embeddings that you can use efficiently for machine learning, and how it fits into your workflow for model development.

  • Shedding Light On Silent Model Failures With NannyML

    September 13th, 2022  |  1 hr 3 mins

    An interview with Wojtek Kuberski about the open source NannyML project and how it combines predicted performance of your model with observed outputs to identify silent model failures.

  • How To Design And Build Machine Learning Systems For Reasonable Scale

    September 10th, 2022  |  54 mins 9 secs

    An interview with Jacopo Tagliabue about how to design machine learning systems to support operations at the scale required by a majority of companies.

  • Building A Business Powered By Machine Learning At Assembly AI

    September 8th, 2022  |  58 mins 42 secs

    An interview with Dylan Fox about the unique challenges and potential involved in building a business with machine learning as the core capability that drives the product and the approach that he has taken at Assembly AI.

  • Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River

    August 25th, 2022  |  1 hr 15 mins

    An interview with Max Halford about the benefits of streaming machine learning for systems that need to learn continuously without being taken offline and how the River library supports building those models.

  • Accelerate Development And Delivery Of Your Machine Learning Projects With A Comprehensive Feature Platform

    August 6th, 2022  |  50 mins 37 secs

    An interview with Kevin Stumpf about the impact of a comprehensive feature platform on the development and serving of machine learning models and how they are addressing that need at Tecton.

  • Build Better Models Through Data Centric Machine Learning Development With Snorkel AI

    July 28th, 2022  |  53 mins 49 secs

    An interview with Alex Ratner about Snorkel AI's platform for data-centric machine learning development that accelerates the rate at which teams can build high quality training data sets with the help of domain experts.

  • Declarative Machine Learning For High Performance Deep Learning Models With Predibase

    July 21st, 2022  |  1 hr 19 secs

    An interview with Travis Addair about the platform that he and his team at Predibase are building to empower everyone to build and deploy deep learning models through a low-code, declarative approach to machine learning development, and how they are extending the capabilities of the open source Ludwig and Horovod frameworks.

  • Stop Feeding Garbage Data To Your ML Models, Clean It Up With Galileo

    July 13th, 2022  |  47 mins 3 secs

    An interview with Galileo co-founder Vikram Chatterji about the challenges of managing unstructured data assets for machine learning projects and how their platform is designed to ease the burden of maintaining clean data sets.