Trailer for the talk
Here's a sneak peek of the talk: This talk will help you get familiar with the essential tools of monorepo management.
About the talk
What is a monorepo? A monorepo is a practice that helps us organize multiple projects into a single git repo. Monorepos can help teams save time on syncing and spend more time on development. Beyond that, monorepos, as a single source of truth for every dependency, can increase visibility and consistency. Here are a few time-consuming syncing tasks that monorepos can resolve:
- Pulling and pushing from git repositories across many folders on your PC just to run the project
- Bumping versions on different npm packages
- Figuring out where to run the "yarn link" command and when to build each package
This talk will cover
- The basic terms and project evolution: monolith, polyrepo, & monorepo
- Basic concepts related to monorepos: npm and yarn links, workspaces, Lerna, and NX
- How open-source projects use Lerna to manage their monorepos and simplify workflow
- How enterprises use NX to reduce build times and simplify their internal workflows
About the speaker
Adam has been developing web products since 2014. These years helped him gather experience both with technology and with people. Today, with his company appk, Adam helps companies hire, train, retain, and scale teams of junior developers.
Transcript
Adam: So thanks everyone for coming. Thanks Codementor and the Codementor team for hosting the event. Yeah, we see from the quiz that the vast majority of people here are working with JavaScript and TypeScript. And the talk is focused mostly on the JavaScript and TypeScript ecosystem. Then again, the principles that I will be discussing are relevant across the board.
Adam: And I will even mention the biggest framework to use outside of the JavaScript ecosystem. So let me share the screen again.
Adam: Let us start.
Adam: Monorepos in the JS world. I actually thought that title didn't describe the talk the way I wanted, so I renamed it just before the launch to Monorepos for JS developers.
Adam: Yeah, so a bit about me. I've been working for several years on different products with different startups. A few years back I changed my focus to working as a consultant with many companies, helping them build and grow their teams effectively. And with a reduced rate.
Adam: So another couple of words about Codementor, because for me the platform truly helped save a lot of time. I am an expert in what I do, I'm an expert in JavaScript and in the JavaScript ecosystem. But let's say AWS, I know almost nothing. There were a few times where I needed some help in AWS. And when I need help in AWS, it means that I am about to spend a week on, I don't know, maybe a few days of work, learning some specific feature of the platform or of networking. So yeah, I just went to Codementor and in a few hours everything was resolved. So thanks for the platform and thanks for this event and the stage.
Adam: So what will we cover today? Section one is basic concepts: basic concepts about the monorepo and how to discuss it, the basic terminology related to managing a monorepo. The second section is the JavaScript monorepo ecosystem. And the third section is an in-depth deep dive into nx and how nx manages deep relationships inside our projects.
Adam: So basic concepts, here it is. We have a few words that we should be familiar with in order to have this conversation about what a monorepo is and how to manage it effectively. And the terms are project or package, repository, monolith, polyrepo, monorepo. So a project or package is self-explanatory, and a repository is just a git repository. A monolith is usually a server, or a UI, but usually a server, that has a huge code base. It has a few advantages: it's very easy to set up, and it is a very natural workflow to grow the project into being a monolith, so we don't need too much expertise or anything. It is very easy to debug things. It's a single code base. We can place the debugger on the first line of the entry file and then we can just step our way through the entire application.
Adam: So as a worldwide development community, we're actually working on how to create a similar experience for microservices. Maybe there are some services you guys know. You can write something about it in the chat if you know a service that helps debug microservices effectively.
Adam: And another advantage of the monolith is that it is great for small applications. Then again, it is not exactly a monolith, but let's say you have some small internal tool and you're using Next for it, Next.js, it's great. Right? You don't need to run many different things. You don't need to configure anything. You don't need to know anything [inaudible]. Yeah, this is it.
Adam: Now the monolith's disadvantage is that, since it's huge, we're going to be building and testing the entire code base every single time something tiny changes. So even if I change, I don't know, some basic UI feature or a small feature in the super admin screen that doesn't affect anyone, I will still rebuild the entire code base. It will also be harder for me to test specific features. It will be possible but harder, and I will deploy the entire code base and run the entire code base as it is.
Adam: So the disadvantage with running the entire code base is that when we're scaling, when we have many instances of a single huge code base and it spins up many additional instances because of some additional load, it will be hard to profile and know exactly where the bottleneck is. Whereas if we have microservices and we see that a single small service suddenly spins up many instances, well, it will be much easier to identify. So these are the advantages and disadvantages of a monolith, and basically its comparison to microservices in a very shallow way. Obviously it's a huge topic. There's a lot of content out there about the two, but yeah, this is not the topic of the conversation.
Adam: A polyrepo means that we had a monolith and we wanted to split it into different projects. And the most intuitive way to do it is to have a polyrepo, or at least it was the most intuitive way, I assume. So, polyrepo essentially means that we manage a git repo for each project. The polyrepo also has a few advantages. It's a very intuitive approach, right? Each project has its repository, a one-to-one relationship, and there is no tooling or expertise needed to set it up. Each repository has its own CI/CD. Again, it has a lot of advantages in comparison to a monolith, because we can test and build and deploy things independently. And it has some disadvantages. It's very hard to maintain. If you have many repositories that you run, you usually also have many IDEs that, as a developer, you open simultaneously and you somehow, you know, juggle them all together, and you will have slower onboarding. Yeah, it has quite a few dependencies. So as you scale, it will be harder to manage. The more git repositories you have, the harder it is to manage.
Adam: And so the answer is the monorepo. The monorepo is basically just a repository that has many projects and packages. So these are the basic concepts, and let's go to the next section, which is the JS monorepo ecosystem. And in order to understand the ecosystem, we need to understand the story behind the ecosystem. How did it develop? So I basically outlined the technologies, but let's look at it again in the way that the project or the ecosystem evolves with time.
Adam: So first we have a monolith. For some reason we understood that having many services is good for our case, so we split it, and obviously the services are interdependent. In some cases it's very easy: we just run different services in our application, these services have different ports, and they make HTTP requests to one another. But in other cases they will be dependent as packages. So we will have one project that is a dependency of another project, and in this case we will link the repositories together with npm link.
Adam: The next stage, after we have many repositories, is merging them into a shallow monorepo that we manage with Lerna, and the third one will have its own section. So yeah, let's dive into it. Linking packages, meaning npm link and yarn link, is just symlinking: symlinks that are managed by npm or yarn. It's nothing too special, but it's very useful. It's very useful especially when you develop an external library locally. So let's say you want to start playing with react, and not just using it, but developing the react library itself. What you will do is git clone the react repository, build it, and make an npm link or a yarn link between the react that you just cloned and your project where you are using react.
Adam: So it is a very cool feature for stuff like this. If you have a small polyrepo, like two or three repositories that are interrelated and need to be linked, it will be very easy to set up and use. But the more projects, the more repositories you have, it just doesn't scale. It will be very hard to maintain. There will be many versioning issues and many build issues that we'll need to manage. And this is where many teams start using Lerna. So an interesting thing about Lerna is that it's not actively maintained. Guys, one thing I would like to ask is that you please type all questions in the Q&A section so that we can review them at the end of the talk and they won't get lost in the chat.
Adam: So, an interesting thing about Lerna is that even though it's widely used, it has over a million downloads per week, it's not maintained. So yeah, that's the state of Lerna: it's widely used and it's not maintained, just a disclaimer. And we were at the stage where we have several repositories and it's daunting to manage them. What we're going to do is set up a Lerna project and use lerna import to bring all of the external projects into our new monorepo. So we will basically use the lerna import command in order to convert the polyrepo to a monorepo. And the way to look at Lerna is like this.
Adam: So you can see my screen and see my browser? Can you type in the chat if you can see my browser, just so I won't lose you guys again. Thank you. Thanks. Thanks everyone. Thanks. Okay, so we have create react app open here, just the GitHub repo under Facebook. And this is a Lerna-managed project. Lerna by convention has a packages folder, and it just manages all of the packages inside the packages folder.
Adam: What do I mean by manage? I mean, you can run a Lerna command, let's say lerna run start, and it will run start for all of the projects. If we want to install all of the dependencies for all of the projects, we can run lerna bootstrap. Lerna bootstrap can also interlink them. Yeah. So let's look at react-scripts. Let's say react-scripts has many dependencies and one of its dependencies is the Babel plugin, babel-plugin-named-asset-import.
Adam: So lerna bootstrap will install all of the react-scripts dependencies with the exclusion of babel-plugin-named-asset-import. And for this one, it will just make a symlink, using the linking technique that we talked about before; like npm link, it will just link it. So lerna bootstrap is a very powerful command, and generally Lerna is a very powerful tool that has a lot of different utilities to manage many packages effectively. So yeah, that's Lerna packages.
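The split that lerna bootstrap is described as making here can be sketched in TypeScript. This is only an illustration of the idea, not Lerna's implementation: the `Manifest` shape, the `planBootstrap` function, and the version ranges are hypothetical names and values made up for the example.

```typescript
// Hypothetical, minimal view of a package.json: only what this sketch needs.
interface Manifest {
  name: string;
  dependencies: Record<string, string>;
}

interface BootstrapPlan {
  install: string[]; // dependencies fetched from the registry
  link: string[]; // dependencies symlinked from the monorepo's packages folder
}

// Decide, for one package, which dependencies live in the same monorepo
// (and can be symlinked) and which must be installed from the registry.
function planBootstrap(pkg: Manifest, localPackages: string[]): BootstrapPlan {
  const plan: BootstrapPlan = { install: [], link: [] };
  for (const dep of Object.keys(pkg.dependencies)) {
    if (localPackages.indexOf(dep) !== -1) {
      plan.link.push(dep);
    } else {
      plan.install.push(dep);
    }
  }
  return plan;
}

// Illustrative data only: the version ranges are placeholders.
const reactScripts: Manifest = {
  name: "react-scripts",
  dependencies: {
    "babel-plugin-named-asset-import": "^0.3.0",
    webpack: "^5.0.0",
  },
};

const localPackages = ["react-scripts", "babel-plugin-named-asset-import"];
const plan = planBootstrap(reactScripts, localPackages);
console.log(plan);
```

In this sketch the Babel plugin ends up in `link` because it is one of the packages in the same repository, and everything else ends up in `install`.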
Adam: The way to really think about them: I think that they have this amusing logo that really describes what Lerna is, a single entity that has many heads, and each head does its own thing. And they're all controlled from a central place. So yeah. So after we manage projects with Lerna, we also reach a certain bottleneck with managing that project.
Adam: And that's where nx comes into the picture. So the nx part receives its own section. So we've discussed the monorepo journey, and now we will discuss nx, deepening the projects and libs with nx.
Adam: So in order to better understand what Lerna is and what nx is, I've drawn this small graph illustration. So Lerna is a single head, right, and it manages many packages. On the other hand, nx has a graph that represents all of the different applications and libs. So what we see here in Lerna would be package, package, package, package, package, but within nx, we have the relationship of each package in a very deep and nested way. It won't only symlink; it will make many smart choices and actions with regard to these dependencies. Right? So, I haven't invented the graph. Actually, nx has its own graph; this is the graph that nx generates, and we will soon look at it in more detail with a small demo.
Adam: So, briefly speaking about nx, it builds and tests projects based only on what you've changed. Lerna also does it in a way, but they have some differences that we will soon discuss. When changing a library, and this is one of the biggest features, it will also build all of the dependents of this library. So if I have a UI app that has a UI lib in its dependencies and we update the UI lib, we will have to rebuild the UI app. So this type of workflow, nx does out of the box and manages it for us. And the last very interesting thing that nx does is that it caches the build results.
Adam: So if I build the UI lib, and I'm now working on the UI app, I won't have to rebuild the UI lib since it was already built; it will use the cache. Okay. So let's look at a small demo. This is an asciinema session that I recorded for the talk. So, yeah, basically here we're just creating an nx project with create-nx-workspace, similar to create-react-app but with nx.
Adam: So after running the command, we have a few choices to select from. We can create an empty workspace, or one that has some pre-configured setup for different types of workflows. We can create a single application. And we can also create a full-stack application, like, let's say, react express.
Adam: This is just the boilerplate. And it's very nice, but actually nx has a CLI for generating all types of libraries and applications. So even if we choose an empty template, we will be able to easily fill it up with some boilerplate. But for this case, I've used react express just so we'll have something to look at and play around with.
Adam: So this demo is not real time. I reduced the installation time so we can skip the boring stuff. And yeah, basically after we make our choices, it's installed, and let's go through the code. So after we created it... Yeah, I'll just go back to the slide so I won't miss anything. Yes. After we install nx, we can type nx graph and it will show us our dependency graph.
Adam: So it says here we need to select something from the sidebar in order to see what's going on. So I'm just going to click show all projects, and here we'll see all the projects that nx has created with create-nx-workspace. And if you look at the folders, we'll see that we have an apps folder and a libs folder. And this is a big part of the nx philosophy or principles, or I think they call it a mental model. And, yeah. So here we see what depends on what, and the arrows point from each project to its dependencies. So the monome-e2e dependency is monome, and in monome, the single dependency is API interfaces.
Adam: Theoretically speaking, if we add additional interfaces, libraries, and utilities, and monome is dependent on them, it will have many arrows coming out of it. But since this is just the boilerplate project, it's a very simple graph. As you grow, the graph will grow more and more complex.
Adam: So let's discuss how things will be built differently here with different use cases. Let's say I'm editing the API interfaces. Nx will then see who is dependent on API interfaces. It sees that both the API and monome are dependent on them, so it will build the API and monome, and then it will check who is dependent on each and every one of them.
Adam: So let's say the API has no dependents. It will finish its work there. And it will rebuild the e2e recursively. If, on the other hand, I have not changed the interfaces for a while and I only changed monome, it will then only rebuild monome and monome-e2e, and if I only change monome-e2e, it will only rebuild monome-e2e.
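The rebuild logic walked through above can be sketched as a traversal over the dependency graph. This is a minimal illustration under stated assumptions, not how nx is implemented: the `Graph` type and the `affected` function are hypothetical, and the project names follow the demo workspace, with the interfaces library written as `api-interfaces`.

```typescript
// Each project maps to the projects it depends on (the arrows in the graph).
type Graph = Record<string, string[]>;

const graph: Graph = {
  "monome-e2e": ["monome"],
  monome: ["api-interfaces"],
  api: ["api-interfaces"],
  "api-interfaces": [],
};

// Return the changed project plus everything that depends on it,
// directly or transitively, in alphabetical order.
function affected(graph: Graph, changed: string): string[] {
  const result: string[] = [changed];
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const project of Object.keys(graph)) {
      const dependsOnCurrent = graph[project].indexOf(current) !== -1;
      if (dependsOnCurrent && result.indexOf(project) === -1) {
        result.push(project);
        queue.push(project);
      }
    }
  }
  return result.sort();
}

console.log(affected(graph, "api-interfaces")); // every project in the graph
console.log(affected(graph, "monome")); // monome and monome-e2e
console.log(affected(graph, "monome-e2e")); // only monome-e2e
```

Changing a library near the bottom of the graph touches everything above it, while changing a leaf application only rebuilds that application, which is the saving described in the demo.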
Adam: So again, this is a very small project, but imagine having, I don't know, 4 to 5 applications, which is, I think, a reasonable amount for even a small company, and having dozens of libraries. It is going to reduce build times quite a lot. So it's a very, very powerful tool.
Adam: And the next thing I'd like to show is the nx caching mechanism. So nx has a build cache. Build cache basically says that once I'm running something, once I'm building something, it caches it. So let's see how it works. This session is shown in real time, so we'll be pausing it, but that means the build time is real. I run nx build monome. It will see that it has no cache, so it will run the entire process. I'm going to pause it for a minute, and we can see that it took seven seconds to build. Also, nx has a dashboard where I can see some additional details about my build and I can share it with my friends and teammates in case they need it.
Adam: So, yeah, that's just the output I had. I can also see here the output I had when I ran it. I can see that it had no cache, right, because the cache missed, and I can see that it took six seconds. Funny, earlier it said seven. So yeah, I will now run nx build monome again and let's see what happens. This time, it took 64 milliseconds. It's hard to notice how fast it was, because the second I typed it, it recreated it. So here we see that the build hit the local cache. We will see the same output as before, but we'll see that the build has a local cache. I will run it again without doing anything special. It will again hit the local cache. So nothing interesting.
Adam: But this time, before running it again, I will remove the nx cache folder. So nx stores the cache locally inside the node_modules/.cache folder. When I remove it, we will have no local cache to use. So let's see how long it takes and let's discuss what happens. Okay. And there we go. Again, that's not seven seconds, but not 64 milliseconds either. Right. This time it took two seconds, and let's see inside the dashboard what was going on. So this time we have a cache from the cloud. So let's review again the different symbols that it had. We can have either a cache miss, a local cache hit, or a cloud cache hit.
Adam: And yeah, so this is why it's two seconds, which is not 64 milliseconds, but it is not seven seconds either, and it scales. So if your build takes 10 minutes and you have a cloud cache, it will still take two seconds, you know, maybe a bit more because you have more assets to download, but it will take just several seconds to download the assets.
Adam: I ran it one time and it downloaded from the cloud. Now it will again take around 66 milliseconds, since it's stored locally again. Now another great thing about the cloud cache is that after I ran the build script locally and it was cached to the cloud, nx takes my package.json, it takes the entire file system that I have, it takes the node version and a few other parameters, and it hashes them into a single SHA. And every time another developer on another machine runs build with the exact same settings, it will download the result and use it.
Adam: So cumulatively we're speaking about a lot of time and resources saved. So this is a very cool and useful feature, especially, again, for large projects.
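The caching behaviour described in the demo can be sketched as a key computed from the build inputs, followed by a lookup that tries the local cache before the cloud cache. Everything here is an assumption made for illustration: `BuildInputs`, `cacheKey`, and `lookup` are hypothetical names, the exact inputs nx hashes are not spelled out in the talk, and a toy djb2 string hash stands in for a real cryptographic hash such as SHA.

```typescript
// Hypothetical description of everything that influences a build result.
interface BuildInputs {
  command: string;
  nodeVersion: string;
  files: Record<string, string>; // file path -> file contents
}

// Toy djb2 hash, kept to 32 bits. A real tool would use a cryptographic hash.
function simpleHash(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 33 + text.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

// Combine all inputs into one key. Paths are sorted so that the key does not
// depend on the order in which files were listed.
function cacheKey(inputs: BuildInputs): string {
  const parts = [inputs.command, inputs.nodeVersion];
  for (const path of Object.keys(inputs.files).sort()) {
    parts.push(path + ":" + inputs.files[path]);
  }
  return simpleHash(parts.join("\n"));
}

type CacheResult = "local hit" | "cloud hit" | "miss";

// Try the local cache first, then the cloud cache. A cloud hit is copied into
// the local cache, so the next run with the same key is a local hit.
function lookup(
  key: string,
  local: Record<string, string>,
  cloud: Record<string, string>
): CacheResult {
  const has = (store: Record<string, string>) =>
    Object.prototype.hasOwnProperty.call(store, key);
  if (has(local)) return "local hit";
  if (has(cloud)) {
    local[key] = cloud[key];
    return "cloud hit";
  }
  return "miss";
}

const inputs: BuildInputs = {
  command: "build monome",
  nodeVersion: "16.13.0",
  files: { "apps/monome/src/main.tsx": "export const a = 1;" },
};
const key = cacheKey(inputs);

const localCache: Record<string, string> = {};
const cloudCache: Record<string, string> = {};
cloudCache[key] = "build output stored by a teammate";

console.log(lookup(key, localCache, cloudCache)); // served from the cloud
console.log(lookup(key, localCache, cloudCache)); // served locally afterwards
```

Because the key depends only on the inputs, a second developer with identical inputs computes the same key and can reuse the stored result, which is the sharing effect described above.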
Adam: And let's move on. Nx has a few other features. So we only touched on its dashboard. It has a few other cool features inside of it. It has distributed workers. So we have this complex graph we're building. This depends on this, which depends on that. And if you run everything sequentially, it will take a lot of time. If we parallelize it on a single machine, we will be limited by the machine's resources. So nx allows us to spin up many machines and distribute the work between them.
Adam: So if we have a build where, let's say, the entire build takes 10 to 20 minutes, like let's say 20 minutes, spinning up a few workers can reduce the time probably by half, probably even more, depending on the dependency graph you have and on your project settings. So yeah, distributed workers, again, a huge resource saver.
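Why the speedup depends on the dependency graph can be illustrated by grouping projects into waves, where every project in a wave has all of its dependencies built in earlier waves and can therefore run on a separate worker. This is a sketch of the general idea only, not nx's scheduling algorithm: `buildWaves` is a hypothetical function, and the project names follow the demo workspace with the interfaces library written as `api-interfaces`.

```typescript
// Each project maps to the projects it depends on.
type Graph = Record<string, string[]>;

const graph: Graph = {
  "monome-e2e": ["monome"],
  monome: ["api-interfaces"],
  api: ["api-interfaces"],
  "api-interfaces": [],
};

// Group projects into waves. Projects inside one wave do not depend on each
// other, so they could be handed to different workers at the same time.
function buildWaves(graph: Graph): string[][] {
  const done: string[] = [];
  const waves: string[][] = [];
  let remaining = Object.keys(graph).sort();
  while (remaining.length > 0) {
    const ready = remaining.filter((project) =>
      graph[project].every((dep) => done.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle detected");
    }
    waves.push(ready);
    for (const project of ready) {
      done.push(project);
    }
    remaining = remaining.filter((project) => ready.indexOf(project) === -1);
  }
  return waves;
}

console.log(buildWaves(graph));
```

For the demo graph this gives three waves: the interfaces library alone, then `api` and `monome` together, then `monome-e2e`. Extra workers only help in the middle wave, which is why a wider graph benefits more from distribution than a long chain.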
Adam: Nx has standard scripts for generating applications and libraries and for serving and building these applications and libraries. So it's very useful. Once again, once you start using it inside your organization, or even for products that I work on on my own, I find it very useful that you have very similar scripts around the same things. And yeah, if anyone here has worked with Angular or with another platform that has generators, it's awesome. It's really awesome. So having generators is a big thumbs up for me.
Adam: So, yeah, this is the outline. This is basically the topic, and we're entering Q&A in a few minutes. If anyone has questions, feel free to ask.
Adam: So what we've discussed so far is the lifecycle of a project and the lifecycle of the JavaScript ecosystem evolving around complex projects. And we also discussed how nx helps and at which stage during our journey. We also discussed what exactly nx does in order to help us and what features it brings to the workflow.
Adam: If you'd like some additional resources or reading: the sessions I recorded are on asciinema, I think this is the way to pronounce it, under ~adamgen, and the nx website is nx.dev. And actually, a bit of history about nx, and yeah, we will run out of time just after this. So nx was built by a couple of developers that were working at Google. At Google, they saw the concept of bazel, I think inside Google it's called blaze, which is basically the mental model under which nx was built. While nx is a project that's aimed at JavaScript, bazel is a project that's aimed at many lower level languages. So Go, C-sharp, Java, Rust users will probably want to use bazel. And just another word about bazel: Google has a huge monorepo. Google is the mother of all monorepos. They have like 80 something terabytes, maybe 90 terabytes by now, of code inside the code base. So you can imagine how complex it is, when some change happens in one place, to realize what needs to be built as a chain reaction. So bazel manages this for other frameworks and languages.
Adam: Q&A. So Marcy asks: if we use npm link, this will not add the package to package.json. How do we deal with it in a production deploy? Yes, that's correct, because if we're using npm link, all npm does is take one package and create a symlink to another package.
There are a few different things that we need to discuss in order to answer this. First of all, is the linked package only local or does it live on npm? If it lives on npm, we will install it. Let's say we're using react, which is a simpler example: I'm developing my own app that uses react, but I'm also developing react as a side project.
Adam: So I will have react inside my node_modules in the application I'm developing, and inside the package.json of the application I'm working on I have react as a dependency, with its version and everything. And when I'm linking, I'm linking to a local thing. So when I'm deploying it, inside the CI script or whatever, I will install it instead. Right. So that's one use case, if the package actually lives somewhere on npm.
Adam: In another use case, I will have one project that is using another project of mine that is not published on npm. This is a bit more tricky and there are a few ways to work around it, but really, nx solves it in a very nice and elegant way. So that's one approach. If we'd like to use neither Lerna nor nx, we can bundle it using webpack or something like this. We can, yeah, we can do quite a few other things.
Adam: Let me know if you'd like me to discuss other options here, since it's a very broad topic and we can dive into many different branches. So Kiryl asks, how does it manage versioning? "It" meaning nx, or "it" meaning...? Yeah. So nx has a single state of the entire monorepo at a single time. Lerna also, by the way; any monorepo, I'd say, desirably has a single state. Let's start from the polyrepo. I have my own project, my own application, and I have react. Every time react is upgraded, I will upgrade react manually. Right. So this is the painful thing. If I have many dependencies, it will be a nightmare. The way Lerna does it is, I think, it has a bump script. I'm not sure. When I upgrade the version of one package, it will update all of its dependents.
Adam: I'm not sure. Nx does it in a completely different way. Nx manages the dependencies without regard to npm. So it's not aimed at publishing things. It's aimed at managing things entirely from the code base and building things from the code base. I think that you pretty much do not need to think about versions when working with nx, unless you publish public packages. Let me know if you have any follow-up questions.
Adam: So, what about turborepo? Any thoughts on that? I actually haven't played with turborepo. I heard that it has some advantages, like it uses Go and stuff like this, and in theory Go is significantly faster than JavaScript. The thing is that what really takes the most time when you're building a huge JavaScript project is building the JavaScript. So if you'd like to reduce the time by using a lower level language, using SWC or, what was the name of it, esbuild, something like this will be the effective approach, because you want to compile a lot of JavaScript. The compilation is the expensive part, and managing the monorepo itself isn't as expensive. With nx you also have the caching mechanism, which is very powerful. Yeah. So it's hard to illustrate that better than I have, but it's very powerful for large code bases. From what I know, turborepo doesn't have a caching mechanism, but again, I have to say that I've only read about it.
Adam: Hey guys. So, thank you. Thanks everyone for coming. Thanks again Codementor for hosting the event and for running the platform, again very useful. If you guys would like to contact me, I am available on LinkedIn as adamgen, on Twitter as AdamGenShaft, and my github handle is adamgen. Thanks again, and I will see you around.
Adam: Thanks. Thank you guys. Thanks for one of the things here.
Adam: Let me know if you'd like me to cover anything else or discuss anything. Just let me know. I will be here for the next five to 10 minutes. Thank you. Thanks for being here.
Okay guys. So if anybody has anything you'd like to discuss, to talk about, to bring up, join me on LinkedIn, send a message, and we'll be in touch. Bye. Cheers.
Highlights of the talk
What are some basic concepts to know before getting started with monorepos?
The basic concepts to understand include project/package, repository, monolith vs. microservices, polyrepo, and monorepo. Projects and repositories are self explanatory and require no further explanation.
A monolith is a server or UI app that uses a single huge codebase. The advantages are that it's easy to set up, easy to debug, and great for small applications. The disadvantage is that every time you add something small, you'll have to test, build, deploy, and run a huge codebase. It's harder to test specific features, it's expensive to rebuild unchanged features, and you deploy and run everything as a single entity. The disadvantage with running the entire codebase shows up when you're scaling: when you have many instances of a single huge codebase and spin up additional instances, it will be hard to profile and know exactly where the bottlenecks are. However, if you have microservices, and you see that a service suddenly spins up many instances, it'll be much easier to identify.
A polyrepo is when you split a monolith into different projects. The most intuitive way is to create and manage a git repo for each project. The advantages of a polyrepo are that it's a very intuitive approach, there is no tooling or expertise required to set it up, and each repository has its own CI/CD. The disadvantages mostly revolve around the difficulty of maintenance. It's hard to manage internal and external dependencies, the onboarding is slow, and the dependencies are high.
What is a monorepo and the JS monorepo ecosystem?
A monorepo is a repository that includes many projects and packages. The classic monorepo journey involves linking the polyrepo repositories with npm link, merging into a shallow monorepo with Lerna, and deepening the projects and libraries with nx.
Linked packages are just symlinks that are managed by npm or yarn. Linking is very useful because you're able to link external libraries that you work on locally and easily connect dependent repositories in a polyrepo. The con is that it doesn't scale for running many projects. To tackle the challenge, you can use Lerna, a widely used toolset that is no longer maintained, and its import command to convert polyrepos into a monorepo. The way to think about Lerna is that it's a single entity with many heads and many different capabilities.