Supercharging Developer Productivity With Advanced Code Search


Engineers at Google and Facebook can search their gargantuan codebases using internal code search engines. These search engines help their developers, both old and new, understand any part of the codebase and start contributing immediately! What about the rest of us?

In this talk, I'll walk through the different types of code search, the tools and software available, and advanced tips and tricks for navigating any type of codebase easily. With the advent of large code repositories and sophisticated search capabilities, code search is increasingly becoming a key software development activity. Every developer who spends far more time reading code than writing it should have access to the best code search tools to amplify their productivity. The audience will walk away with everything they need to confidently onboard onto, navigate, and understand any codebase, whether small, medium-sized, or daunting.


All right. So, if you're on, just follow me. I'm Prosper Otemuyiwa, and I work as a developer advocate at Sourcegraph. We have an agenda. Every developer in the world spends a huge amount of time reading code, more than writing it. In fact, if you use GitHub a lot, you discover that even when you're on your laptop, you're on GitHub, trying to keep up with notifications for pull requests, comments, and everything else you're doing with code. Right? So, if you spend more time reading than writing, then you should have tools that let you search code very easily.
So, we have a tool called Sourcegraph. That's why I said you should open your phones. If you open your phones, go to Sourcegraph. I like to call it the Google of code search. This is what it looks like. If your interface doesn't look like that, you're not on Sourcegraph, so please just cross-check. Right?
So, with Sourcegraph, this is the value we offer to developers. Sourcegraph has currently indexed over 2.1 million open source repositories across GitHub and GitLab. So, right now, you can search code on Sourcegraph, and it gives you results from GitHub and GitLab. You can also search private code across several repositories. The same way you can search locally in your IDE, you can have precise code intelligence on Sourcegraph.
And with Sourcegraph, there's a feature we have called batch changes. Instead of manually opening several pull requests to different repos, with batch changes you can have a file, and it makes those pull requests for you so you don't have to do that yourself. And then we have two features called code monitoring and code insights. But for this talk, I'm going to be talking more about search. Like I said, Sourcegraph can be called the Google of code search. It's literally a search engine that allows you to search all of open source code and all of your private code. Right?
So, what are the code search patterns we have with Sourcegraph? There are three types of search you can do right now: literal, regular expression, and structural.
So, let's go to literal. I know many of us are familiar with the Suspense component. If you're a new developer, or you're trying to see occurrences of a particular class, symbol, or definition, you can copy the code, paste it into Sourcegraph, and it goes ahead and searches for you. Right now, this is me placing the Suspense component in Sourcegraph. It searches all of the open source code it has indexed across GitHub and GitLab, and it shows me examples and results of people using the Suspense component across GitHub and GitLab. So now you can see the results. You can scroll down. On the left-hand side, you can see all the filters and operators that you can add. You can search through diffs, you can search within a time range, and you can search by the committers of the code. I can see how people are using certain APIs.
Another API we can talk about in ReactJS is unstable_batchedUpdates. Say you want to know how many people are using unstable_batchedUpdates in the world. How many people are using it, and how are they using this particular API method? With Sourcegraph, all you need to do is paste it in. If you look at the arrow, it searches within the ReactJS organisation, across all of the repos within the organisation, and it brings back every usage of unstable_batchedUpdates. So, from within the React code itself, and from every other example within the React organisation, you get results.
Now let's go to regular expression. How many of us are familiar with regular expressions? I know we probably don't use them every day, but Sourcegraph allows you to flex your muscles with regular expressions, if you understand what you're trying to look for. You may be trying to look for something within a file, for example.
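To make regular-expression matching inside a file concrete, here is a minimal Python sketch. It runs locally and does not use Sourcegraph at all. The pattern, the `find_react_versions` helper, and the sample package.json text are all hypothetical, chosen only to illustrate matching a React dependency pinned to major version 17 or 18.

```python
import re

# Hypothetical pattern: a "react" dependency on major version 17 or 18,
# optionally prefixed with ^ or ~ as is common in package.json files.
REACT_VERSION = re.compile(r'"react":\s*"[\^~]?(17|18)\.\d+\.\d+"')


def find_react_versions(package_json_text):
    """Return every substring of the text that matches the pattern above."""
    return [match.group(0) for match in REACT_VERSION.finditer(package_json_text)]


# Made-up package.json content, used only for this illustration.
sample = '''{
  "dependencies": {
    "react": "^17.0.2",
    "react-dom": "^17.0.2"
  }
}'''

print(find_react_versions(sample))
```

Note that the pattern requires the closing quote immediately after `react`, so the `react-dom` entry is not matched. The same kind of precision is what makes a regular-expression code search useful for questions about dependency versions.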
So, here, I was trying to look at how people are using some of the latest versions of React, by using a regular expression to search the package.json of several repositories. I pasted this into the Sourcegraph search engine, toggled on the regular expression mode, and these are the results it gives me. With these results, you can narrow down to see how many people, or how many projects, depend on a particular version of React.
Some of us here are open source maintainers, and some of us are open source authors. You really want to get an idea of how many people are using your open source package, or how many people are using certain versions, maybe because you need to deprecate certain methods, or maybe because there is a new architectural change you need to work on. With this, you can easily scroll and see how people are using it: Webpack, RemixRun, Visualiser. You can see how many people really depend on a particular version of your project.
This is an example of me searching through branches, right? I was searching through the Sourcegraph repository, and specifying with a regular expression to search within every branch that has mchap in its name. So this goes into all of the branches that have mchap in the name, and then searches for Ubuntu latest. With that, it returns all the matching code for me.
This is just open source code. You can connect all your private repositories as well. If you sign up and connect your private repositories, you get the same kind of results. It searches within the scope of all the repositories that you've connected to Sourcegraph.
And then we have structural search. I think this is one of my favourite types of search. With structural search, you can literally take blocks of code.
So you can take if conditions, try/catch blocks, a full block of code, and paste it into the Sourcegraph search engine, and it's going to give you the results you're looking for. This is me searching through the Google organisation on GitHub and looking for this block of code: if not path.length, followed by the three dots. It goes ahead and looks for all the code that fulfils this particular condition and returns it to me. So now I can see all of the projects, all of the codebases, and all of the files that have this within them, and then I can keep narrowing my search. There are so many keywords that you can use. In fact, we have a booth out there with a cheat sheet, so I encourage you, after this talk or when you're free, to come out to our booth and get the cheat sheet. There are a lot of things to learn there.
So why do you actually need Sourcegraph? Developer velocity. As your team grows and you build more features, you want your developers to move very fast. If you have all your repositories connected to Sourcegraph, it allows them to search everything, right? Some people get into a company, and within the first week they're already committing code. They can easily get to grips with the codebase, right? For some people, it takes three weeks, because you have to go through different repositories and you don't know what you're doing. With Sourcegraph, everything is in one place, and it allows your developers to onboard onto the codebase quickly.
And then with refactoring, there is no case of "I left this behind" or "I forgot". I know you have tests. Yeah, that's very good. But when you have shared systems, for example a shared design system used across different repositories, sometimes you forget that a particular component is used in another part of the codebase or in another repository.
With Sourcegraph, once you paste that in, it brings out all the references from all the different repos connected to it, no matter how many, 100, 200, or 1,000, and you get all those references and you move fast.
Now a recap of the search parameters. Again, there are different keywords you can use with Sourcegraph, right? This is just a very good illustration of some of those keywords. You have the repo keyword, which allows you to search within a particular repo or within an organisation of repos. Then you also have branch, where you can specify the branch to limit and narrow the search. You also have type symbol. With type symbol, instead of returning the code with the function call, it just returns the definition of the class or the definition of the function. With file, too, you can limit it and say, hey, I want you to just search within the config files or within the package files. And then with type commit, you can say, hey, I want you to search within a particular commit or within a diff. Then you can specify the lang, for example. If somebody is trying to learn about lambda functions in Python, they can type in lambda and narrow the search to Python, and it will give them all the ways lambda functions have been used in the world or within their private codebase.
If you want to learn more about code search, I implore you to visit our entire suite of learning tutorials for everything that has to do with regular expression search, structural search, and everything else you can do with code search. And then we have more resources. We have an active presence on dev.to. We have our blog. We also organise what we call Dev Tool Time. Dev Tool Time happens biweekly. So if you go to our YouTube channel and search for Dev Tool Time, you can see the amazing set-ups of different developer workspaces. And then you can search for our podcast as well.
You can use all these resources to learn more about code search and about Sourcegraph. So thank you very much. You can follow me on Twitter and on GitHub and ask me as many questions as you want. Thank you.
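The search keywords recapped in the talk (repo, file, lang, type) are written as key:value filters alongside the search pattern. As a minimal sketch, assuming that key:value form, the hypothetical helper `build_query` below assembles such a query string in Python. It is not part of any Sourcegraph tooling and makes no network calls; it only produces text you could paste into the search box.

```python
def build_query(pattern, **filters):
    """Join key:value filters and a search pattern into one query string.

    build_query is a hypothetical helper for illustration only. It does not
    validate the filter names and does not contact Sourcegraph.
    """
    parts = [f"{key}:{value}" for key, value in filters.items()]
    parts.append(pattern)
    return " ".join(parts)


# Narrow a search for "lambda" to Python code, as in the talk's example.
print(build_query("lambda", lang="python"))

# Limit a search for "react" to package.json files (the file value is a
# regular expression, so the dot is escaped).
print(build_query("react", file=r"package\.json"))
```

Keeping the filters as keyword arguments means each narrowing step from the talk (by repo, by file, by language, by result type) maps to one extra argument.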
10 min
22 Oct, 2021
