Raising the next generation of data scientists with Jan Carbonell from Strive School
10 January, 2021
Jan Carbonell is an engineer, maker, entrepreneur, data scientist and educator. Having started out by founding a study group to improve his own learning experience, Jan today leads the AI program at Strive School, a YC-backed startup raising the next generation of data scientists across Europe.
Hi Jan, let's just start with you telling us a bit about yourself. What is your background and how did you get into data science?
I initially started in industrial engineering and then realized that wasn't my passion. I wanted to create instead of optimize, so I figured that coding would be more relevant in that regard. I did learn a lot of things through industrial engineering, but if I had the chance to choose my degree again, I would probably go for either computer science or mathematics. But I guess I am also living proof that you can make the wrong choice initially and still succeed in your chosen field. So how I would define myself is I'm an imposter in the data science world, because I don't have a CS background, I've learnt most of it on my own, and I then joined a master's in AI that was full of CS people. But I'm very much a data scientist and an entrepreneur. Those are the things that I identify with today.
Can you tell us more about Strive School? What's the story and your mission?
We sold our startup akademy.ai to Strive School. Strive just came out of YC and had a really bold mission of training the next generation of data scientists and software engineers all over Europe. Data science is not a field that is accessible to everyone today. Many people could and should pursue it, but if we look at the data, it's mostly white males from great universities.
AI is a really powerful tool. But it needs a specific application, and it also needs to be critically reviewed. One of the ways in which I think we can empower fairness in AI is by ensuring there's a great diversity of people in the field. There are a lot of people in the world that are not taking the leap into AI, often because they've never had the right resources. I really believe that fostering a platform that gives them the right resources, the right mentorship, and the right knowledge can help many more people get into the field.
How do you go about fostering diversity in your selection process at Strive School?
It's been a long journey. We've come to this model by failing a lot of times. Before Strive School I launched a non-profit together with Miguel, my co-founder, focusing on teaching people AI, called Saturdays.ai. The idea was that people had a free day a week to learn, so we could use that day, Saturday in this case, to empower and facilitate said learning. Tons of people signed up all around the world, which actually surprised us. For me, that was the first indication that a lot of people from diverse backgrounds were super passionate about this field. And that is the major filter for scouting AI students - passion, perseverance and hard work - not whether they have a programming or mathematics background.
The passion really is important. Our approach is not - "Hey, we're going to teach you AI or machine learning", but - "we're going to teach you AI in order to do X". We have people say things like, "I'm a doctor and I really want to use AI to see if I can improve detection in medical imaging" or "I am a linguist and I want to use AI to improve real-time translations", etc. In that case, they are not only interested in the lessons in a general sense, they put a lot of passion into their projects because they have a clear goal they want to achieve with AI as the means.
"We can enable fairness in AI by ensuring diversity of people in the field."
What was that passion in your personal data science learning journey?
I tried to learn programming several times and initially, I failed each time. At first, I would get my programmer friends to come to hackathons with me. But I would always end up only doing the pitches, building HTML pages and serving the coffee, so I wasn't learning. After that, I was working for a consulting company and we were surveying the state of the automotive industry. I got really into researching self-driving cars, Tesla in particular, and I realized that projects like that one are the future of AI and are going to revolutionize the field. Being passionate about a specific AI application is what really pulled me in and motivated me to learn. Later, I came across a famous essay from Marc Andreessen on why software is eating the world. And then I realized maybe someone hasn't written the essay yet, but AI is going to eat the world next and I want to be a part of it rather than just be a user.
"AI is going to eat the world and I want to be a part of it rather than just the end user."
Can you share more about your learning journey? Are there courses you did, books you read or people that helped you upskill?
The AI field has a lot of content. Everyone has this next great book that they're trying to recommend to you, when there are a lot of old things that really work. I took one programming course that really helped me out, CS50x from Harvard, which is one of the most well-known programming courses. The passion of the teacher was incredibly inspiring to me and it's something that I try to emulate as a teacher myself.
After that, I bought the AI bible, Artificial Intelligence: A Modern Approach. It's a great source of information if you know where to look. I had joined a couple of courses like fast.ai and deeplearning.ai, which are the traditional ones, but I often found myself lost at 3 AM when I was fighting a problem. I knew that even if I asked a question on an AI forum, I would get the answer maybe a couple of days later, which didn't help me.
So that was how the idea for the non-profit started. We said "let's create something like a study group, let's get a set of people that want to learn AI together". The pressure that it put on me is that I had to be accountable. In 24 hours we sold out all the tickets. We had the funds to pay for coffees and food, we had companies knocking at our door saying that they wanted to sponsor the event, and we had a community going that we had to provide resources for. That is what motivated me to learn quickly, so that I could help others learn.
Soon after, I did an AI master's, which was a really eye-opening experience. I was used to very high-quality learning from the online courses that my AI friends and I carefully selected for AI Saturdays. In the master's, I asked myself "why are you teaching algorithms that are from 2008 and are no longer relevant? Why are we doing this in Java?" It was very disappointing to me as a student to see how outdated the traditional teaching approach was, especially since it is held in very high regard. I wanted to maximize my learning and I felt like I was not getting that.
What would you recommend to people starting out in AI today?
Definitely a programming course like CS50x to just get some fundamentals in programming. There are some great books out there like Artificial Intelligence with Python and Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. I think a book is a great asset: you go through the code, you execute it, and you figure out whether it works or not.
There's another great resource called distill.pub, which is a set of publications where they take a machine learning paper and break it down. I would go even deeper on those types of resources. There's another resource called Papers with Code. And if someone is in a Spanish-speaking country, I really hope that they come to Saturdays.ai because we're really working hard to remove all the barriers to getting started.
My biggest recommendation would be to follow your passion. If you have passion for something, try to develop a project around it with AI; this will guide your learning. I think going after your passions is something that people should do more often. Don't do a course for its credentials. If you're going for the academia-based approach, you definitely have to get more into papers and state-of-the-art and focus more narrowly. If you're just curious, figure out what you're most curious about and where you can apply your passion. That will guide your learning in a natural way.
For someone interested in Strive School, can you give an overview of the application process and what sort of things they should expect?
Strive School is a bootcamp that takes 6 to 8 months. I can speak for the AI program, but we also have programs for web development and full-stack. The course is built for 6 months, but we also understand that some people take a longer time. We take people that don't have any programming background and are willing to work hard. You're doing the same amount of coding that you would do in a 2-year Master's, condensed into 6 or 8 months.
At Strive, we want to select people that we know could learn and succeed on their own. We want to accelerate their path and make sure they get the best learning possible. First of all, it's about getting you the industry knowledge that you need to get a job in the field. You will learn what the state of the art is in machine learning and deep learning, and understand how to apply it in a real project. The learning is very much project-based - 2-3 hours of lectures every morning and 5-6 hours of exercises. Each day builds upon the previous one.
The admission process is very much “tell us why you're interested”; that's our first filter. Then we have another interview where we try to figure out what are your long-term motivations, expectations, and how you work in a high-pressure environment. We also care about the cohort structure, so it may be that you are the right fit for Strive, but you're not the right fit for this current cohort, so we might recommend a delay in the start. What we do is we bet on the success of our students and we offer them whatever resource we think will help them succeed. I perform the roles of a teacher, a psychologist, a mentor, a guard, a friend. I do whatever needs to be done to make sure that everyone that goes through the program succeeds.
"Figure out what you're most curious about and where you can apply your passion."
Can you say more about the income-sharing model that works at Strive School?
An income-sharing agreement is something that was first proposed by a Nobel Prize-winning economist in the 1950s, and nobody really paid any attention at the time. The whole idea is - let's not charge people based on what they can earn now, but based on their future income. Basically, we are going to charge you on the increase in your future income that we are going to provide. If I'm going to help you increase your income, then I deserve a piece of that future income. But if I fail to do so, I also should not collect anything because I didn't provide anything.
Let me use an example - if somebody happened to be a baker before they went to Strive School and then finished and continued to be a baker, our education has been worthless. As an educational institution, we can say we do many things but we haven't successfully helped someone get a job in the field. But if someone is a baker and then ends up working as an analyst at Goldman Sachs or a data scientist at Facebook, they have doubled or tripled their salary. The income sharing is set at around 10% and we set a maximum so that the total is kept to a reasonable amount. This model incentivizes meritocracy. Education is no longer only for people who can afford to take a 2-year gap from working or who have the savings. It creates access for people who might not have the resources upfront, and they can pay back once they get the job.
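Editor's note: the arithmetic of a capped income share can be sketched in a few lines of Python. This is only an illustration, not Strive School's actual contract. The 10% rate comes from the interview; the cap of 15,000 euros, the yearly salaries and the function name `isa_payments` are hypothetical, and the sketch ignores any conditions on when payments begin, which the interview does not specify.

```python
def isa_payments(yearly_incomes, share_percent=10, payment_cap=15_000):
    """Return the payment due each year under a capped income-share agreement.

    yearly_incomes: whole-euro incomes for consecutive years (hypothetical).
    share_percent: share of income paid each year (about 10% per the interview).
    payment_cap: hypothetical maximum total repayment; the interview gives no figure.
    """
    payments = []
    total_paid = 0
    for income in yearly_incomes:
        remaining = payment_cap - total_paid      # what is still owed before the cap is hit
        due = min(income * share_percent // 100, remaining)
        payments.append(due)
        total_paid += due
    return payments


# Hypothetical salaries for the first four years after the program.
schedule = isa_payments([40_000, 45_000, 50_000, 55_000])
print(schedule)       # yearly payments
print(sum(schedule))  # total repaid, never more than the cap
```

With these made-up numbers the graduate pays 10% of salary for three years, and the fourth payment is cut short because the cap has been reached. A graduate with no income pays nothing, which matches the idea that the school only collects when it has delivered.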
What is the best thing that happened to you in 2020?
I'm part of a network accelerator, a group of people called Celera. When coronavirus started spreading, someone from the group suggested that we could try to do something about the shortage of ventilators. One person on the team was a doctor and started breaking down all the things that are involved in building a medical-grade ventilator. We wanted to try to replicate it. I was in Denmark at the time, went to the electronics store, bought an Arduino and we started replicating the project simultaneously in Barcelona, Madrid and in Copenhagen.
We had great leadership from the founder of the accelerator and a diverse team with varying expertise. We developed a successful low-cost ventilator, moved into animal testing and had it approved for human trials. We got a donation of 250,000 euros to buy the components and had the government purchase some of our prototypes. Soon enough, we developed 50 devices.
Just seeing how we managed to come together as a team and develop this emergency solution so quickly was amazing and gave me a lot of inspiration. We ended up sending these 50 low-cost prototypes to Latin America where there was a great need for them. A medical-grade ventilator costs 20,000 USD and we were able to build ours for 2,000 USD and send it to the frontlines. That was incredibly inspiring for me in 2020.
What was the favorite book you read this year?
There are two. The first is The Man Who Solved the Market. It's about Jim Simons and how he created his multi-million-dollar fund. He's basically a data scientist that has cracked Wall Street. I really enjoyed reading about the amount of time, effort and knowledge it took him to succeed. He didn't just rely on some quick schemes, he went very deep into the mathematics behind the stock market dynamics. I found it a really interesting read that I wasn't anticipating.
I also liked another book called Why Nations Fail. It was a very interesting read in the context of 2020 because we have definitely seen some failing forms of government or at least of governance. It was interesting to see how this has repeated across history and which things are likely to happen over and over again just because of how we behave as humans.
What is your favorite Python library?
If I had to pitch things that really helped me out, I'd highlight PyTorch and TensorFlow. I remember first learning TensorFlow and I wasn't really getting it, and then it just clicked and it works so well. I think pandas is also great because it helps me not have to deal with SQL, which I've definitely run away from, and Beautiful Soup for web scraping, so when someone doesn't want to give you the data you can just take it and use it. These two could be a fun set for every data scientist.
Thanks for that plug opportunity, we actually recently shipped a pre-installed libraries feature for Deepnote. All these libraries you mentioned come pre-installed in Deepnote, all you have to do is import them into your project.
We actually have our students testing out Deepnote today; we're doing some web scraping projects on IMDb. You asked me what tools I like personally, and I prefer Deepnote for collaboration. You're doing a great job here. Jupyter Notebook is not a tool built for teams and neither is VS Code. There should be another standard for the way the data science process is done, and I think that Deepnote is doing really good work on that, so I really look forward to using the tool more.