If you have followed the news in any capacity over the last few years, you have undoubtedly heard about LLMs and the companies building them. Though the consumer-facing models from companies such as OpenAI have captured the hearts and minds of the public, enterprises have different concerns, such as data privacy and model reliability.
Enter Cohere, a company developing LLMs to address these issues. Started in 2019, Cohere partners with companies such as McKinsey, Oracle, and all the hyperscalers, and boasts a valuation of $5.5 billion. I sat down with one of the co-founders, Ivan Zhang, to talk about his journey.
TL;DR: Key Takeaways
If you only have a few minutes, here are a few golden nuggets from my conversation with Ivan:
- Sales should be customer-centric. This seems obvious, but have you ever started a sales conversation by talking about your product? Instead, try starting the conversation by asking your customer about their pain points, and then addressing them.
- Data will be a bottleneck for AI development. Not in the sense that we will run out of internet to train on, but rather that some kinds of real-world data have few published datasets.
- Your startup doesn’t have to be in the Bay to succeed. Cohere is headquartered in Toronto. This works for them because VCs are more open to remote work after the pandemic and Toronto has a lot of budding technical talent.
- A key signal to look for when hiring is mastery in any domain. Learning how to master something is fairly transferable. Masters in one domain can likely become masters in another.
- Being the lone Asian in a leadership room takes some time to adjust to. Getting more reps in has helped Ivan get used to his position.
Tell me about your upbringing. What pushed you to become a hacker?
I was born in Shenzhen, China, and moved to Toronto when I was eight. As a kid, I was always on the computer, even in China, playing the most random arcade games. I don’t want to paint Asians with a broad stroke, but if you’re a male Asian person who has been to high school in the past decade, more likely than not, you’ve played a ton of video games. I did that a lot, and I feel like it built a digital tolerance and stamina for sitting in front of a screen for long hours. Gaming was a big part of why I’m so drawn to building things, working really hard, and doing crazy hours in front of a screen.
When I got to university, I discovered the hackathon community. Every weekend, we would just meet up at random Ontario universities and sometimes U.S. universities. We would build for the whole weekend. Gaming addiction primed me to get really good at software engineering. I spent my entire first year going to hackathons, building stuff on the side. I had this realization that I could be spending 18 hours playing League of Legends, or I could use the same energy to build stuff, which is also very fun and not as frustrating as a solo queue. It felt like a natural progression.
How has your role within Cohere changed as the company has grown?
When you’re in a startup, your role tends to evolve a lot. That’s been especially true for my time at Cohere. When we started, I was hands-on coding every day, building some of our early systems and pipelines. Eventually, the company grew, and we hired people. Some of our founders had to step up and do some people management.
Eventually, we hired people managers as well, so I have to do less of that. Nowadays, I’m much more settled. I’d say I spend probably 50% of my time still doing technical stuff like machine learning and data processing. For the other 50%, I spend time with customers, either building new relationships, fostering existing ones, or really bringing in business. I think that’s a big part of a founder’s job, like founder-led sales.
How was the transition to doing sales? Did that come naturally to you?
As someone who spent most of my time as the most backend of backend developers, it was definitely uncomfortable at first. When you think about sales, you might think of a business guy with a suit trying to convince someone to give you money for stuff. But when I started doing it more and more, it ended up being about genuinely trying to learn more about your customers, asking them good questions, and figuring out what issues they have. It’s honestly the most important thing you could do. Why are you building anything if you don’t know what your customers want?
Sales has been super fun because it informs my technical work. I’m starting to see patterns across these conversations, which help inform what we should focus on. Another aspect is that as a founder, when you’re doing sales, you’re meeting executives and other founders as well–so you’re building good professional relationships. For anyone who’s new to it, you should expect some discomfort. But, like anything, you just learn to do it.
Why did you guys decide to focus on enterprise?
The enterprise market is often seen as boring, but actually, most of the world’s efficiency will come from automating a lot of these back-end use cases that don’t necessarily need human potential. A lot of these tasks we’re automating don’t need a well-educated human being sitting at a computer for eight hours a day. We’re much more interested in unlocking that human time for something else. Enterprise is where a lot of this kind of work is done. For example, Oracle is one of our customers, and they’ve deployed our models into their NetSuite features. So, a ton of HR and ERP use cases are now powered by our models. Our mission is scaling intelligence to serve humanity, and what better way to do it than to target the people doing this kind of back-end work at scale?
We also liked the idea of not having to suck up all of our customers’ data to train these models. In enterprise, we actually ship our models on-prem and we air gap them, so we don’t see any of our customers’ data. In those kinds of scenarios, we have to build our models slightly differently. We have to teach our models to use the enterprise’s internal knowledge bases, use the information returned from these knowledge bases, and then write an answer and cite it. This is all RAG (retrieval augmented generation), which is what we’re focused on because we deploy these models in this air-gapped environment where we can’t continuously suck up people’s usage data. We still have an API and a demo, but the majority of our deployments are on-premise.
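To make the retrieve-then-cite loop Ivan describes concrete, here is a minimal sketch in Python. It is not Cohere's implementation: the knowledge base, the document IDs, and the helper names (`Doc`, `retrieve`, `answer_with_citations`) are all hypothetical, the retriever is a naive keyword-overlap scorer, and the "answer" simply quotes the retrieved passages where a real system would have a language model write a grounded response.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str  # hypothetical identifier used as the citation label
    text: str


def tokenize(text):
    # Lowercase words with surrounding punctuation stripped.
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query, docs, k=2):
    # Score each document by how many words it shares with the query
    # (a stand-in for a real search index or embedding model).
    q = tokenize(query)
    scored = [(len(q & tokenize(d.text)), d) for d in docs]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [d for _, d in scored[:k]]


def answer_with_citations(query, docs):
    # Stand-in for the generation step: quote each retrieved passage
    # and attach the ID of the document it came from.
    hits = retrieve(query, docs)
    if not hits:
        return "No supporting documents found."
    return " ".join(f"{d.text} [{d.doc_id}]" for d in hits)


# Hypothetical internal knowledge base that never leaves the customer's network.
KB = [
    Doc("hr-001", "Employees accrue 1.5 vacation days per month."),
    Doc("it-007", "Laptops are refreshed every three years."),
    Doc("hr-002", "Vacation requests need manager approval."),
]

print(answer_with_citations("How many vacation days do employees accrue?", KB))
```

Because everything the answer needs is fetched at query time from the customer's own documents, nothing in this loop requires sending usage data back to the model vendor, which is the property that matters for an air-gapped deployment.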
What are some areas of AI, whether infrastructure or application, that you think more people should be building in?
I think there will be a bottleneck in terms of training data, but not in the way it’s currently been communicated to the public. Some people paint a picture that we’ll run out of the internet to scrape. I think that’s a naive understanding of how these models work and how to train them. In terms of capturing digital signals (thoughts, demonstrations, audio, video, images, sensory data), there’s going to be a big bottleneck in capturing data from the world that’s not already represented on the internet. The internet does a good job of capturing data and it grows exponentially every year, but there’s information about the world that we experience on a daily basis that never makes it online.
Do you mean information that isn’t easily digitally representable, such as smell?
Yeah, so smell, for example: we’d have to build this hypersensitive chemical detector. So if we want autonomous interactive robots in the real world that can smell, there is an information problem we have to solve for the robots, right? They don’t have a nose; we don’t have a great digital nose for them. And I’m sure there’s other stuff.
Why have you decided to keep Cohere headquartered in Toronto instead of the Bay Area?
The three of us co-founders just really like living in Toronto. We grew up here, we have roots here. We also realized that it is actually a pretty great city for AI. At some point in time, most of the people leading AI in the Valley came from Toronto. They came from Geoff Hinton’s lab at the University of Toronto. So there’s a ton of great early talent here. And I think because of how the VC attitude has shifted since the pandemic, investors are investing everywhere. They aren’t restricting investments to the Valley. Remote work is a thing now. And so luckily for us, we’ve been able to raise successfully. We’ve been able to build good models.
I’ve heard that you don’t pursue brand-name resumes when hiring. What are some signals that you look at instead of pedigree that indicate someone could be a great hire?
I’ve thought about this a lot. I try to think about my own trajectory and places that have rejected me and what signals they didn’t see. What it comes down to is that I look for folks who are extremely skilled at something. Once you’re skilled at something, you can take that anywhere else. Sometimes we want to hire more experienced folks, for sure. But especially in a field like ours where disciplines such as pre-training haven’t been around long enough for anyone to have much experience, you want high-energy, high-autonomy individuals who are excited to learn, experiment, and fail fast.
I try to ask a few questions that probe at stuff related to this. Like, “What’s the coolest thing you’ve worked on?” Then I get to see if they’re going into a lot of detail and are really excited about this random thing. “What’s the hardest you’ve worked on anything?” I want to see if they can go the equivalent of 16 hours a day, seven days a week when they’ve actually gone to bat for something. And then obviously, experience in the field is definitely a big plus, but honestly, the best hires we’ve made have just been high-energy, hardworking, genuinely curious people who are also skilled at software engineering.
How has being Asian impacted your founder’s journey?
I’m not ignorant of it. I definitely know it’s a thing. For example, on my board, I’m the only Asian guy on it. But I’ve been fortunate in that, growing up in Toronto and starting this company in Toronto, it’s not that weird to be an Asian guy in tech. So on the technical front, I don’t see it as a hindrance whatsoever. But it’s definitely been a factor for my own comfort in leadership situations–being the only person that looks like me in certain situations. But honestly, once I got the reps in, being in those situations, it was totally fine. And I think there are a few things that, depending on how you were raised, could affect you in doing business at the leadership level here. But generally, once you get the reps in, it doesn’t even matter.
Is there anything you’d like to shout out?
I’d like to shout out my wife; she’s a big inspiration for me. Shoutouts to my mom, my dad, my sister. I think my parents taught me what it means to work really hard. I think this is true for many of us immigrants, watching our parents and grandparents work the most menial labor jobs just so we can be fed and go to school. It’s super inspiring. It makes me think: Why am I not working harder? Why am I not thinking about how I can level up and help them retire? My family has been a big inspiration to me. Lastly, shoutouts to Cohere and my co-founders. It’s been super fun, and I’m privileged to be working on this tech. I think it’s genuinely the most interesting thing I could be doing right now.
If you are high-energy, hardworking, and highly curious about AI, please consider applying to work with Ivan at Cohere! You can check out available positions at this link.
Thank you for joining us, Ivan!