The increasing sophistication of machine learning has enabled dramatic transformations of businesses and introduced new product categories. Assembly AI offers advanced speech recognition and natural language models as an API service. In this episode, founder Dylan Fox discusses the unique challenges of building a business with machine learning as the core product.
- Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
- Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out!
- Your host is Tobias Macey and today I’m interviewing Dylan Fox about building and growing a business with ML as its core offering
- How did you get involved in machine learning?
- Can you describe what Assembly is and the story behind it?
- For anyone who isn’t familiar with your platform, can you describe the role that ML/AI plays in your product?
- What was your process for going from idea to prototype for an AI-powered business?
- Can you offer parallels between your own experience and that of your peers who are building businesses oriented more toward pure software applications?
- How are you structuring your teams?
- On the path to your current scale and capabilities, how have you managed the scoping of your model capabilities and operational scale to avoid getting bogged down or burnt out?
- How do you think about scoping of model functionality to balance composability and system complexity?
- What is your process for identifying and understanding which problems are suited to ML and when to rely on pure software?
- You are constantly iterating on model performance and introducing new capabilities. How do you manage prototyping and experimentation cycles?
- What are the metrics that you track to identify whether and when to move from an experimental to an operational state with a model?
- What is your process for understanding what’s possible and what can feasibly operate at scale?
- Can you describe your overall operational patterns and delivery process for ML?
- What are some of the most useful investments in tooling that you have made to manage development experience for your teams?
- Once you have a model in operation, how do you manage performance tuning? (from both a model and an operational scalability perspective)
- What are the most interesting, innovative, or unexpected aspects of ML development and maintenance that you have encountered while building and growing the Assembly platform?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Assembly?
- When is ML the wrong choice?
- What do you have planned for the future of Assembly?
- From your perspective, what is the biggest barrier to adoption of machine learning today?
- Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show, then tell us about it! Email firstname.lastname@example.org with your story.
- To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
- Assembly AI
- Learn Python the Hard Way
- NLP == Natural Language Processing
- NLU == Natural Language Understanding
- Speech Recognition
- RNN == Recurrent Neural Network
- CNN == Convolutional Neural Network
- LSTM == Long Short Term Memory
- Hidden Markov Models
- Baidu DeepSpeech
- CTC (Connectionist Temporal Classification) Loss Model
- Grid Search
- K80 GPU
- A100 GPU
- TPU == Tensor Processing Unit
- Foundation Models
- BLOOM Language Model
- DALL-E 2
Support The Machine Learning Podcast
Predibase’s founders saw the pain of getting ML models developed and into production, a process that took up to a year even at leading tech companies like Uber, so they built internal platforms that drastically lowered time-to-value and increased access. The key was taking a “declarative approach” to machine learning, which Piero Molino (CEO) introduced with Ludwig, an open source framework for building deep learning models with 8,400+ GitHub stars, more than 100 contributors, and thousands of monthly downloads. With Ludwig, tasks that once took months to years can be handed off to teams and completed in thirty minutes, with just six lines of human-readable configuration defining an entire machine learning pipeline.
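To give a sense of what such a declarative configuration looks like, here is a minimal sketch in the style of Ludwig's documented `input_features`/`output_features` schema, written as a Python dict (Ludwig also accepts YAML). The feature names and the task (text classification) are illustrative placeholders, not taken from the episode:

```python
# A minimal declarative model configuration in the style of Ludwig.
# The pipeline (preprocessing, architecture, training) is inferred
# from this declaration rather than written imperatively.
config = {
    "input_features": [
        {"name": "review_text", "type": "text"},
    ],
    "output_features": [
        {"name": "sentiment", "type": "category"},
    ],
}

# Summarize the declared features.
for feature in config["input_features"] + config["output_features"]:
    print(f"{feature['name']}: {feature['type']}")
```

The point of the declarative approach is that everything not specified here (tokenization, encoder choice, training loop) falls back to sensible defaults, which is what collapses months of pipeline work into a handful of configuration lines.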
Now with Predibase, we are bringing the power of declarative machine learning, built on top of Ludwig, to broader organizations with our enterprise platform. Just as Infrastructure as Code simplified IT, Predibase’s machine learning (ML) platform allows users to focus on the “what” of their ML models rather than the “how”, breaking free of the usual limits of low-code systems and bringing the time-to-value of ML projects down from years to days.
Click here to learn more and try it for yourself!