The Machine Learning Podcast

Episode Archive

15 episodes of The Machine Learning Podcast since the first episode, which aired on June 3rd, 2022.

  • Build More Reliable Machine Learning Systems With The Dagster Orchestration Engine

    December 1st, 2022  |  45 mins 43 secs
    data orchestration, mlops

    Building a machine learning model one time can be done in an ad-hoc manner, but if you ever want to update it and serve it in production you need a way of repeating a complex sequence of operations. Dagster is an orchestration engine that understands the data that it is manipulating so that you can move beyond coarse task-based representations of your dependencies. In this episode Sandy Ryza explains how his background in machine learning has informed his work on the Dagster project and the foundational principles that it is built on to allow for collaboration across data engineering and machine learning concerns.

  • Solve The Cold Start Problem For Machine Learning By Letting Humans Teach The Computer With Aitomatic

    September 27th, 2022  |  52 mins 7 secs

    Machine learning is a data-hungry approach to problem solving. Unfortunately, there are a number of problems that would benefit from the automation provided by artificial intelligence capabilities that don’t come with troves of data to build from. Christopher Nguyen and his team at Aitomatic are working to address the "cold start" problem for ML by letting humans generate models by sharing their expertise through natural language. In this episode he explains how that works, the various ways that we can start to layer machine learning capabilities on top of each other, as well as the risks involved in doing so without incorporating lessons learned in the growth of the software industry.

  • Convert Your Unstructured Data To Embedding Vectors For More Efficient Machine Learning With Towhee

    September 21st, 2022  |  51 mins 53 secs

    Data is one of the core ingredients for machine learning, but the format in which it is understandable to humans is not a useful representation for models. Embedding vectors structure data in a form that is native to how models interpret and manipulate information. In this episode Frank Liu shares how the Towhee library simplifies the work of translating your unstructured data assets (e.g. images, audio, and video) into embeddings that you can use efficiently for machine learning, and how it fits into your workflow for model development.

  • Shedding Light On Silent Model Failures With NannyML

    September 13th, 2022  |  1 hr 3 mins

    An interview with Wojtek Kuberski about the open source NannyML project and how it combines predicted performance of your model with observed outputs to identify silent model failures.

  • How To Design And Build Machine Learning Systems For Reasonable Scale

    September 10th, 2022  |  54 mins 9 secs

    An interview with Jacopo Tagliabue about how to design machine learning systems to support operations at the scale required by a majority of companies.

  • Building A Business Powered By Machine Learning At Assembly AI

    September 8th, 2022  |  58 mins 42 secs

    An interview with Dylan Fox about the unique challenges and potential involved in building a business with machine learning as the core capability that drives the product and the approach that he has taken at Assembly AI.

  • Update Your Model's View Of The World In Real Time With Streaming Machine Learning Using River

    August 25th, 2022  |  1 hr 15 mins

    An interview with Max Halford about the benefits of streaming machine learning for systems that need to learn continuously without being taken offline and how the River library supports building those models.

  • Accelerate Development And Delivery Of Your Machine Learning Projects With A Comprehensive Feature Platform

    August 6th, 2022  |  50 mins 37 secs

    An interview with Kevin Stumpf about the impact of a comprehensive feature platform on the development and serving of machine learning models and how they are addressing that need at Tecton.

  • Build Better Models Through Data Centric Machine Learning Development With Snorkel AI

    July 28th, 2022  |  53 mins 49 secs

    An interview with Alex Ratner about Snorkel AI's platform for data-centric machine learning development that accelerates the rate at which teams can build high-quality training data sets with the help of domain experts.

  • Declarative Machine Learning For High Performance Deep Learning Models With Predibase

    July 21st, 2022  |  1 hr 19 secs

    An interview with Travis Addair about the platform that he and his team at Predibase are building to empower everyone to build and deploy deep learning models through a low-code, declarative approach to machine learning development, and how they are extending the capabilities of the open source Ludwig and Horovod frameworks.

  • Stop Feeding Garbage Data To Your ML Models, Clean It Up With Galileo

    July 13th, 2022  |  47 mins 3 secs

    An interview with Galileo co-founder Vikram Chatterji about the challenges of managing unstructured data assets for machine learning projects and how their platform is designed to ease the burden of maintaining clean data sets.

  • Build Better Machine Learning Models With Confidence By Adding Validation With Deepchecks

    July 5th, 2022  |  48 mins 40 secs

    An interview with Shir Chorev and Philip Tannor about model validation and testing with the open source deepchecks library and the challenges of testing machine learning projects.

  • Build A Full Stack ML Powered App In An Afternoon With Baseten

    June 28th, 2022  |  46 mins 26 secs

    An interview with Tuhin Srivastava about how the Baseten platform allows data scientists and ML engineers to build a full stack machine learning powered application by themselves in an afternoon.

  • Introducing The Show

    June 3rd, 2022  |  1 min 11 secs

    Introducing the new podcast about how to go from idea to production with machine learning.