Understand Your Codebase to Innovate Faster


The transition from a monolithic architecture to a service-based approach began a number of years ago. The advantage of a service-based approach is increased agility and a shorter build and release cycle. More recently, the rise of micro frontends means that this approach encompasses not only backend applications but also frontend development. However, one side effect is increased complexity of the codebase, which makes it harder to see how components relate, who owns a component, and how an activity affects dependent components. This session will touch on these challenges and on what is needed to maintain individual developer velocity.



So I'm giving you my, or perhaps Sourcegraph's, perspective on how we can improve developer efficiency by helping people understand their code base more effectively.
In preparing for this talk, I was thinking about the fact that I've been in software development and technology for some time, and about the big changes that have occurred over the last 10 years, probably more than 10 years: how they're relevant to software development, how we actually deliver software applications, and how we can accelerate that process. So here's my list of five; your list may be different.
First, open source software. When I started in software development, the APIs and libraries you had available were essentially the ones that shipped with the operating system, or came with the development tool or platform you'd invested in. Now we have libraries for everything from the front end (we're at a React conference, of course) to the back end, providing databases, search tools, many, many libraries, and perhaps the challenge now is deciding which library or framework you wish to use.
And of course, there have been big changes in how people go about building software applications.
We've gone away from the big-bang approach of waterfall-type methodologies, where we capture our requirements up front, which takes some time. I worked for some time for a research company, and we did a lot of work for large government organizations, with huge requirements documents that we had to either read or prepare. We've moved to a more iterative approach, with smaller chunks in terms of the capabilities we want to deliver. And that's great, because it allows us to shrink the loop between building and delivering applications, capabilities or additional functions and getting feedback from our stakeholders, clients or customers. So it's a much more iterative, fast-paced approach, and it allows us to correct direction much earlier in the lifecycle of a particular application or solution.
And of course there's the move to cloud. Again, from a software perspective, you no longer have to worry about getting hold of infrastructure or networking at a base level. You can go out and get someone to deliver you a database service, or provide middleware on demand, without having to worry about how to configure, deploy or manage it. Again, that's providing additional building blocks for building and delivering software solutions. I'd perhaps say there's a parallel between cloud computing and open source software: they're both fundamentally about providing additional building blocks that allow software developers to focus on their business problems, functions and capabilities, where they can really add value for their customers.
The last two are continuous integration and delivery, which is very much aligned with agile development, and service-based architecture. With continuous integration and delivery, the focus is on testing and on delivering and deploying software much more frequently and rapidly: again shrinking that loop, but also enabling us to test with smaller changes and get feedback more quickly. Ideally, when you do run into problems, the set of changes you have to evaluate to find the root cause is smaller.
And then finally, service-based architecture. There's been a big change in how we build software from an architectural standpoint, from client-server to three-tier to these multi-service types of applications. The goal is to increase encapsulation and decouple application logic, making it potentially much easier to make changes within our code, because your changes have less impact, or limited impact, on other services when you're sitting behind some sort of service-based API. And that's relevant to front-end applications as well, where there's perhaps a move towards micro front ends: a set of composable application components that we can stitch together very easily.
So all of these changes and initiatives have obviously had an impact on how quickly we can develop software. But if we think about the software development lifecycle, it's not one big loop. Many people tend to think of it as an outer loop and an inner loop. The outer loop is essentially the releases, the projects, the sprints that we're undertaking.
So it's more of a team effort in some sense. And I'd probably argue that while the five changes above definitely improve the development lifecycle, they're more focused on that outer loop. Certainly if you think about agile development and continuous delivery, they're squarely aimed at improving the efficiency of that outer loop, though they do have benefits for the inner loop.
The inner loop is really about what you're doing as an individual developer. I have to make some change. I have to add a new capability. I have to fix some problem that appeared in our test suite. There's an incident and I have to investigate it. Or I'm trying to remove some form of technical debt within an application or set of services. That's really the focus for individual developers, when you're actually writing the code and you're closer to the source.
And of course, what we all want is to be as efficient as we can within that inner loop. Once we have a good understanding of the task ahead of us, once we have full context and understand the tech stack we're dealing with at that point in time, we can be very efficient and very productive. We know exactly what we're doing. We can get going. We're in front of our IDE and we can write our code, test it and iterate in a very efficient manner. The key challenge people face is interruptions to that flow, where you have to switch context.
Most organizations have planned interruptions: we have meetings to update people on progress. But we also have interruptions from other members of our team asking us questions. And of course there may be things we don't understand ourselves. We don't have full context or full understanding, and we have to go away and fill in the gaps in our knowledge in order to achieve the task we're looking at. So really the question is how we can minimize those interruptions, so that everybody has context and has that fast flow in how they're writing software.
This is a cartoon that one of my colleagues pointed out to me, about how we think about writing software, which is another way of talking about that inner flow. When we're writing software, we have some blue-sky thinking: we're thinking about how best to write this method, what this class is going to look like, how it can encapsulate things, how we can make our code easier to read and easier to change. And that's great. But the reality is that we often spend a lot of our time trying to understand things before we can actually be productive. That might be really basic stuff, like trying to understand something about the language or tech stack you're involved in.
I can remember, fairly early on in my career, a real challenge. Probably the most stressful time I had was working on an options trading system.
It wasn't a core part of our business, but every now and then I would be shipped off to a basement in Zurich or Frankfurt, and I'd have to sit there on a Sunday trying to fix this trading system that had suffered some sort of failure. And I had to get it working by Monday. The code itself was relatively complex, but the changes I had to make were very, very simple. But I could only make those very simple changes once I had a full understanding of what was causing the problem. So I spent my time trying to understand that code base so I could make a small change to get the system up and running.
And that's essentially the reality for many of us. We live in an age of software; essentially everything is driven by software in some way. Therefore there's a huge demand to iterate on and improve our software solutions, which means there's a lot more code out there than there used to be. We have many more developers, and there's additional complexity. And if you look at the studies from people like Microsoft (this one's actually from Stack Overflow), from a development perspective people spend 75 percent of their time trying to understand their code rather than actually writing code. So what we ideally want to do is change those metrics so we can write code more effectively.
So how can we help people understand their code more easily? From a Sourcegraph perspective, this is essentially our goal: how can we help people get context and understanding more quickly, so they can spend more time actually writing code? So how do we do that? Essentially, we have a universal search platform. What it allows our customers, the developers who are using the platform, to do is search across all of their code.
So all of their repos, regardless of which code host they're using for source control: their front-end repos, their back-end repos, the different services that comprise their application. It helps them answer questions, essentially. How is this service implemented? Who else is using this particular component? Who else is using this function or API call to the back end? How do I use it? It helps people get that more complete understanding so they can carry out their development and deliver actual application code.
So I'm just going to dive in and give some really simple examples of how we might use this platform. Now, I'm not a React developer, but if I were, what I could do with Sourcegraph is some very simple queries. The system we're looking at here has about 2 million repos indexed on it, so 2 million public GitHub and GitLab repos. I know that Formik is a React library, so I can do a very, very simple query to find out where it occurs within my code base. I can maybe do a slightly more complex query, a regular expression query, to see where it's imported, so which code is actually importing that library. In this case, we're doing a very simple regular expression query to show where in the code that occurs. If you want to dig down and look at specific React functions, again, these are very simple queries, but they immediately give me information about my code base and where I might want to go to find out how someone is actually using that function. Of course, from a React perspective, this is very basic. But at that point, it allows me to then dig into the code.
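To make the "where is this library imported" idea concrete, here is a minimal local sketch of a regular expression search for Formik imports. It is not Sourcegraph's implementation, and it is not the query shown in the talk (which the transcript does not capture); the pattern, the function name `find_imports` and the sample files are all assumptions made up for illustration.

```python
import re

# Hypothetical pattern: matches ES-module imports (from "formik") and
# CommonJS requires (require('formik')). The talk's real query is not shown.
FORMIK_IMPORT = re.compile(r"""(?:from\s+|require\(\s*)['"]formik['"]""")


def find_imports(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for each line importing formik."""
    hits = []
    for path, text in sorted(files.items()):
        for number, line in enumerate(text.splitlines(), start=1):
            if FORMIK_IMPORT.search(line):
                hits.append((path, number, line.strip()))
    return hits


# Made-up sample "repository" contents, keyed by file path.
SAMPLE_FILES = {
    "src/LoginForm.tsx": 'import React from "react";\n'
                         'import { Formik, Form } from "formik";\n',
    "src/legacy/signup.js": "const { useFormik } = require('formik');\n",
    "src/utils/format.ts": "// formik is mentioned here but not imported\n",
}

if __name__ == "__main__":
    for path, number, line in find_imports(SAMPLE_FILES):
        print(f"{path}:{number}: {line}")
```

A plain keyword search for "formik" would also hit the comment in the third file; anchoring the pattern on the import syntax is what narrows the results to code that actually depends on the library.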
I can now start to look at the other symbols that are defined within this code. So we're going from pure search towards the capability you'd expect within your IDE in terms of code navigation. The key difference is that we're providing simple navigation across all of your code: not just the repos that you have on your workstation or laptop and are using within your IDE, but code that perhaps lives outside of your team, or code that you don't need to touch very often. So it's giving you access to that code both in terms of search and in terms of code navigation.
As for React, we actually use React within Sourcegraph for our UI. One of the things we did, about 8 to 12 months ago now, was change how we use React. We were using class components rather than function components. So in this case, what I can do is search, again across all of these 2 million repos, for where we're using function components.
But what we do with Sourcegraph is not just provide search and code navigation, but also use search to power other capabilities. In this particular example, we were migrating from one code pattern to another. So we can also use search to show how we're tracking and monitoring that particular migration: a code pattern migration in this case, but it could equally be how we're adhering to best practice in ensuring our code has the relevant metadata, or how we're migrating from one library version to another across our code base. Because we have all the code and all the commit history to search and index, we can start to create time series views of that code base.
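As a rough illustration of a search-driven time series, the sketch below counts class components and function components in a few dated snapshots of a code base. It is an assumption-laden toy, not how Sourcegraph computes its views: the two regular expressions are simple heuristics, and the names `migration_series` and `SAMPLE_SNAPSHOTS`, the dates and the file contents are all invented for the example.

```python
import re

# Heuristic patterns (assumptions, not a real parser):
# class components look like "class X extends React.Component" (or PureComponent),
# function components look like "function X(" or "const X = (" with a capitalised name.
CLASS_COMPONENT = re.compile(
    r"class\s+\w+\s+extends\s+(?:React\.)?(?:Pure)?Component\b"
)
FUNCTION_COMPONENT = re.compile(
    r"(?:function\s+[A-Z]\w*\s*\(|const\s+[A-Z]\w*\s*(?::[^=]+)?=\s*\()"
)


def count_matches(pattern: re.Pattern, files: list[str]) -> int:
    """Total number of pattern matches across all file contents."""
    return sum(len(pattern.findall(text)) for text in files)


def migration_series(snapshots: dict[str, list[str]]) -> list[tuple[str, int, int]]:
    """Return (date, class component count, function component count) per snapshot."""
    return [
        (
            date,
            count_matches(CLASS_COMPONENT, snapshots[date]),
            count_matches(FUNCTION_COMPONENT, snapshots[date]),
        )
        for date in sorted(snapshots)
    ]


# Made-up snapshots standing in for the code base at three points in history.
SAMPLE_SNAPSHOTS = {
    "2021-01-01": ["class Header extends React.Component {}",
                   "class Footer extends Component {}"],
    "2021-06-01": ["class Header extends React.Component {}",
                   "function Footer() { return null; }"],
    "2021-12-01": ["const Header: React.FC = () => null;",
                   "function Footer() { return null; }"],
}

if __name__ == "__main__":
    for date, classes, functions in migration_series(SAMPLE_SNAPSHOTS):
        print(f"{date}  class={classes}  function={functions}")
```

Plotting the two counts against the date gives the kind of view described here: one line falling as class components are removed and one rising as function components replace them.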
So in this particular case, we're showing the migration of our UI from class components to function components. Essentially, this gives us a view of our changing code base. If we're responsible for that particular initiative, it allows us to make decisions. Is this going as fast as we want? If it's not, why is that? What actions can we take to change the rate of change for this particular initiative?
And finally, one of the other things we can use search for is to help us automate changes across our code base, across many, many repos. In this case, we have a script, and we're using that script to find out which repos we need to change, and then defining a set of steps to make those changes across all those code bases. So we're automating change across multiple repos, and then monitoring how those changes are progressing through the relevant CI pipeline and review process. The goal, from a developer perspective, is to use a tool that lets us focus on higher-value work and that takes care of the more basic code refactors that otherwise may not happen, or may require us to liaise with many different teams to get them to make those changes and then review and accept them.
So that was a quick tour of the Sourcegraph platform, and I'd just like to thank you for your time. Again, from a Sourcegraph perspective, our goal is to help improve the efficiency of that inner loop. How can we help developers understand their code, get context, get into flow, and enable their colleagues to answer their own questions rather than interrupting you to ask questions about the code base?
I'd love for you to go to Sourcegraph.com and use universal search across two million plus repos. That includes many, many React repos and, for example, all the Facebook repos that are public. Go and try it out. See what impact it has on your own development lifecycle. And I'd just like to say thank you very much for your time.
19 min
24 Oct, 2022
