Talk

ML and LLM in production

Thursday, May 29

11:05 - 11:35
Room: Tortellini
Language: English
Audience level: Intermediate
Elevator pitch

From laptops to self-driving cars, ML deployment has exploded beyond the cloud. Add LLMs to the mix, and you’re not just monitoring model drift – you’re preventing AI from suggesting glue as a pizza topping. Welcome to the wild new world of ML in production.

Abstract

The job of data scientists and AI developers is pretty tough. It often starts from a paper with almost no runnable code, or from an off-the-shelf model that has to be optimized or fine-tuned.

But that’s not enough. Models might need to be trained or deployed on specific hardware (GPUs, IoT and mobile devices), and their performance might drift over time.

During this talk I’ll share my experience leading platform and AI service teams, and how we tackled some of the major problems you might encounter along your AI journey, such as training, serving and monitoring ML models.

We’ll start from a data scientist’s laptop and discuss how to make her successful, covering both the capabilities you should provide and the tools that might help.

The last part of the talk is dedicated to LLMs: how they differ from traditional ML models from a serving and monitoring perspective, and how to start using them in production.

Tags: Machine-Learning, Applications, DevOps and SRE
Participant

Christian Barra

Christian Barra is a Software Engineer, Tech Lead and international speaker living in Lisbon. He’s the co-founder of ZeroBang, a cloud consulting company. He is an active member of the tech community in Berlin, a conference organiser and a Python Software Foundation Fellow. You can follow him on Twitter at @christianbarra.