Monoliths, Microservices and Multitenancy
This entry discusses some high-level concepts that are relevant to modern software architecture, namely monoliths vs. microservices, and multitenancy. I'll give some guidance as to what those terms mean and how they apply to this foundational concepts blog series and the accompanying demo application.
In this and future blog entries I'll assume that you fall into a certain audience, namely individual entrepreneurs, small software startups, or developers/teams building new solutions within a larger organization. This information is still valuable if you are working on legacy systems or doing what we call brownfield development. However, the spirit of the information presented is most applicable to building new solutions from scratch, what is referred to as greenfield development.
Likewise, there are myriad kinds of software applications and corresponding types of developers. Someone developing embedded software for hardware devices is going to have a different approach than another developer building line of business applications. My focus going forward is going to be on full-stack web application development, with an emphasis on solutions that are hosted in the Microsoft Azure cloud.
Who Is the Customer? What Is Your Business Model?
For the sake of discussion, let's say that you are trying to serve a certain customer archetype, namely:
- Small, startup-type companies occupying a well-defined niche (B2SB, or business to small business) OR
- Individual consumers using a subscription/freemium model, or free with affiliate marketing, etc. (B2C)
With these groups as your target, you will be building your applications using a common, tried-and-true business model: software as a service (SaaS). This simply means that you build and host your own applications, as opposed to delivering the software directly to the customers. Anyone who wishes to use your applications gains access through any number of subscription levels. Under the SaaS model, you own and maintain the production software systems at all times and generate revenue by offering use of those systems to customers as a service.
What Is the High-Level Approach?
In short, you will build monolithic web applications which have been flexibly architected, so that in the future, as your business grows, you can scale them up, scale them out, or break them apart into microservices. This is a common direction for software startups to take, and it's what I recommend when starting out.
Focus Is on the Cloud
One of the great benefits of the cloud is that you don't have to pay a great deal of money for equipment up front, rack space in a data center, and so on—you only pay for what you need. You've converted your capital expenditures (CAPEX) into operating expenses (OPEX), which is huge for software startups, especially if you're bootstrapping on your own dime (there's a good article about the difference between the two here).
There is a plethora of other benefits that come with using the cloud, but some of the obvious ones are:
- Better security. If using platform as a service (PaaS) then the underlying infrastructure is provided for you to build your apps on, and you don't need to worry about the maintenance nightmare of keeping that infrastructure up to date so that you don't get hacked. You can focus on building your apps.
- It's easy to provision high-level system resources (databases, BLOB storage, etc.).
- It's easier to scale. A good example would be using elastic database pools to scale out horizontally.
- You only pay for what you use. You're not wasting dollars and kWh running a server in a data center that runs at far less than full capacity most of the time. It’s better for the environment too.
- Finally, if you architect your solution well, it should be easy to containerize down the road. I'm not going to get into that right away, however. Starting out, I'm going to build and deploy the demo application using PaaS as an Azure App Service, and we'll take it from there.
Monolith to Microservice
What Is a Monolith?
The term monolith typically refers to a software system, subsystem, or component that is large, difficult or impossible to break apart into smaller pieces, and opaque. For this reason, "monolith" has often come laden with a lot of baggage and negative connotations. Indeed, if you've spent any amount of time working on legacy systems that are more than five years old, you'll find that this is often the case. However, it doesn't have to be, and it shouldn't be, especially in this third decade of the 21st century, in which the cloud is front and center of all our design decisions. So, I'm going to make a bold declaration, which is that:
- If your architecture is built using solid (and SOLID) architectural principles AND
- You utilize good design practices, such as domain-driven design, AND
- You think compositionally, with a mind toward horizontal scalability and a move toward microservices down the road, THEN
you will have a solution that is flexible and robust and scalable enough that you can decompose it into smaller pieces should the need arise. I will explain just how to do this later, when I introduce my own architectural template for a Clean Domain-Driven Design application.
What Is a Microservice?
A microservice is a single application or subsystem, with a well-defined and cohesive purpose, which does not constitute a full software system in and of itself, but rather plays a supporting role alongside larger systems. Microservices expose an API and are typically called over the network, using a REST-style HTTP interface or a message bus. They typically live in the cloud, are deployed into their own containers, and have their own database/persistence store which is separate from the rest of the application(s) that communicate with them. Microservices have been all the rage for several years because they embody the Single Responsibility principle on an architectural level, and they allow parts of your system to scale independently of other subsystems.
For example (just off the top of my head), the developers of a financial application may wish to put all the CPU-intensive modeling code, such as Monte Carlo simulations, into its own microservice, either in the cloud or on some beefy machine in a data center somewhere. Or perhaps within a large organization, for reasons that are technical (political), the Accounting department needs to have its own database with a corresponding data model which is separate from all other systems. This database should ideally live behind its own microservice, as opposed to being directly modified by the HR application, and so on.
It may seem like more work to implement your grand architecture this way, but it isn't. My humble opinion is that allowing applications to communicate with each other via cross-database joins is an architectural anti-pattern, and it ultimately leads down the path of pain.
Ensuring a Smooth Transition
Starting out with a monolithic solution does not necessarily imply that everything must go into a single project. We will be breaking it up into separate projects using the principles elaborated on in the next couple of blog entries. If you have applied domain-driven design correctly and divided your solution into bounded contexts with a logical separation that makes sense, then splitting it up later should not be a problem. Once again, this is under the assumption that you are using sound principles and practices.
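To make the idea of a clean boundary between bounded contexts concrete, here is a minimal sketch in Python. It is not taken from the demo application: the `BillingService`, `InProcessBillingService`, and `OrderModule` names are hypothetical. The point is only that one context calls another through an interface, so the in-process implementation could later be swapped for a client that calls a separate service.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    tenant_id: str
    amount_cents: int


class BillingService(Protocol):
    """Boundary of a hypothetical Billing bounded context."""

    def create_invoice(self, tenant_id: str, amount_cents: int) -> Invoice: ...


class InProcessBillingService:
    """Monolith-era implementation: runs in the same process as its callers."""

    def __init__(self) -> None:
        self._invoices: list[Invoice] = []

    def create_invoice(self, tenant_id: str, amount_cents: int) -> Invoice:
        invoice = Invoice(len(self._invoices) + 1, tenant_id, amount_cents)
        self._invoices.append(invoice)
        return invoice


class OrderModule:
    """Another bounded context; it knows only the BillingService interface."""

    def __init__(self, billing: BillingService) -> None:
        self._billing = billing

    def place_order(self, tenant_id: str, total_cents: int) -> Invoice:
        return self._billing.create_invoice(tenant_id, total_cents)


orders = OrderModule(InProcessBillingService())
invoice = orders.place_order("tenant-a", 2500)
```

If billing were later extracted into its own microservice, only the object passed to `OrderModule` would need to change, assuming the replacement honors the same interface.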
What Is Tenancy?
Regarding software architecture, tenancy refers to the logical, and sometimes physical separation of data belonging to different parties. In general, "tenant" refers to an individual customer of your SaaS application. However, it could refer to different departments within an organization, or other groups of users. The key concept to grasp is that these groups must have complete data isolation, and absolutely no access to the data which belongs to any other group. When a software solution has been correctly architected to handle multiple tenants while robustly enforcing data isolation, then we may call this a true multitenant application.
The vast majority of the time, I suggest you stick to the orthodox nomenclature and don't try to invent new lingo to describe these concepts, with one notable exception: if you are building an application in which the term "tenant" belongs to the business domain, for example a real estate management system, then you should refrain from using the term "tenant" in the technical sense and instead use "customer" and "customer ID". Otherwise, it is appropriate to refer to a logically isolated set of customer data as a "tenant," which is identified by a tenant ID.
Why Tenancy Is an Important Consideration
Understanding what tenancy is and how to build multitenant solutions is exceptionally important for any SaaS business. Considering worst-case scenarios first, if you build your solution wrong and leak private customer data to other customers you can:
- Get sued.
- Take a devastating hit to your company's reputation.
- Lose customers and revenue.
- <Insert additional catastrophic scenario here>
You get the point. On a positive note, if you pick the right tenancy model then you can ensure that your solution will scale well over time as more customers are added and maximize the use of your cloud provider so that costs stay low.
For these and additional reasons, I will adamantly state that for any modern SaaS application, tenancy is a primary consideration, not an afterthought. I strongly recommend that you pick your tenancy model first rather than trying to shoehorn in some kind of multitenant solution later on down the road. Trust me on this one.
Multitenancy Solution Templates
Here I've listed out a few high-level templates for multitenant solutions and the implications of each. I don't claim this list to be complete, but it should give you a good idea of some of the possibilities.
Template A: Single Application, Single Tenant Database
The architecture has no consideration for tenancy built into it. It is a single application, deployed to a single server or cloud application service with a hard-coded database connection string to a single database. At a very good lecture I attended at 2019's Chicago Code Camp the speaker, Jonathan Tower, described this as "not a good pattern," which is probably the understatement of the year. I mostly agree with him on this. However, as stated in the previous blog post, there are exceptions to every rule. Under uncommon circumstances, you might be able to get away with this. Two situations I can think of are:
- If you represent an internal IT department building a specialized application for another department in your organization, and it is highly unlikely that any other groups of people will need to use the application, then the single app/single DB model may be just fine.
- If you are a DevOps pro and really, really know what you are doing, then you might be able to transition such a solution to Template (E), described below. The emphasis here is on "really know what you are doing."
Template B: Single Application, Single Multitenant Database
There's a single deployment of the application which is deployed to a single server or cloud application service. There is a single connection string to a database which is logically partitioned to handle multiple tenants. This is accomplished by adding a tenant ID column to the primary key (and thus, typically, the clustered index) of every customer-specific table. Conceptually, this causes your database to be divided into sets of logical slices, each one pertaining to a separate tenant.
Tables which are shared, or otherwise non-customer-specific do not need a tenant ID. An example of this would be a table which holds a list of currencies, or a table which lists all the different states and zip codes in the United States.
This is a fine solution, but there are a couple of downsides to it that you need to be aware of:
- There may be paging issues and other low-level database considerations which can impact performance. These are outside the scope of this blog entry, but if you really want to go down that rabbit hole check out this stack exchange question.
- Demands of individual customers, such as wanting to stay on a previous version of the code, may force you to implement certain hacks or other patterns such as feature flags or a tenant-by-tenant versioning system. This can become unmanageable if you wish to keep your database schema the same across all tenants. Alternatively, you can simply draw a line in the sand and make it clear to your customers that upgrades to the system are compulsory and that they will always be on the latest version, whether they like it or not.
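As a rough illustration of the logical partitioning described for Template B, here is a minimal sketch in Python. It uses SQLite from the standard library purely as a stand-in for a real database server, and the `currency` and `invoice` tables and the `invoices_for` helper are hypothetical names. The tenant ID leads the primary key of the customer-specific table, the shared table has no tenant ID, and every tenant-specific query filters on the tenant ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared reference table: no tenant ID needed.
    CREATE TABLE currency (code TEXT PRIMARY KEY, name TEXT NOT NULL);

    -- Customer-specific table: tenant_id leads the primary key.
    CREATE TABLE invoice (
        tenant_id  TEXT    NOT NULL,
        invoice_id INTEGER NOT NULL,
        currency   TEXT    NOT NULL REFERENCES currency(code),
        amount     INTEGER NOT NULL,
        PRIMARY KEY (tenant_id, invoice_id)
    );
    """
)
conn.execute("INSERT INTO currency VALUES ('USD', 'US Dollar')")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?, ?, ?)",
    [
        ("tenant-a", 1, "USD", 100),
        ("tenant-a", 2, "USD", 250),
        ("tenant-b", 1, "USD", 999),
    ],
)


def invoices_for(conn: sqlite3.Connection, tenant_id: str) -> list[tuple[int, int]]:
    """Return (invoice_id, amount) rows belonging to one tenant only."""
    rows = conn.execute(
        "SELECT invoice_id, amount FROM invoice WHERE tenant_id = ? ORDER BY invoice_id",
        (tenant_id,),
    )
    return rows.fetchall()
```

Routing every tenant-specific read through a helper like this is one way to avoid forgetting the tenant filter; it is a design choice for the sketch, not something the template itself mandates.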
Template C: Single Application, Multiple Databases (Database-Per-Tenant)
There's a single deployment of the application which is deployed to a single server or cloud application service. Every tenant has their own database and thus their own copy of the database schema. You maintain a separate "catalog" database or other API which acts as a lookup to determine which database belongs to which tenant.
This, or some variant thereof, is the most common multitenancy pattern that I've encountered in my professional career. There are both advantages and disadvantages, but here are the upsides:
- The most obvious advantage to this approach is that you have hard isolation between tenants. Unless you go out of your way to shoot yourself in the foot (like using cross-database joins, etc.), there is effectively no chance that customer-specific data will spill over from one tenant into another.
- Another advantage is that you can utilize cloud resources, such as elastic database pools, to give more capacity to the tenants who use the application most. Likewise, this will help you get the most bang for your buck, cost-wise.
- Finally, having separate databases allows you to more easily maintain separate versions of your application for different tenants. Be careful, this is a double-edged sword. I've worked for companies where client-specific customizations were baked into each database and after a while there was no single consensus schema. This is an easy trap for startups to fall into, as you are constantly bending over backward to make the customer happy. The obvious solution is to try to find a way to fit customer requests into the core schema and build that into the upgrade path for your application. However, that's easier said than done.
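The catalog lookup at the heart of Template C can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `CATALOG` dictionary and `connection_for` function are hypothetical, and separate in-memory SQLite databases stand in for real per-tenant databases.

```python
import sqlite3

# Hypothetical catalog: tenant ID -> database location.
# Each ":memory:" connection is an independent SQLite database.
CATALOG: dict[str, str] = {
    "tenant-a": ":memory:",
    "tenant-b": ":memory:",
}

_connections: dict[str, sqlite3.Connection] = {}


def connection_for(tenant_id: str) -> sqlite3.Connection:
    """Look up the tenant in the catalog and return its own database connection."""
    if tenant_id not in CATALOG:
        raise KeyError(f"unknown tenant: {tenant_id}")
    if tenant_id not in _connections:
        conn = sqlite3.connect(CATALOG[tenant_id])
        # Every tenant database gets the same schema in this sketch.
        conn.execute("CREATE TABLE note (body TEXT NOT NULL)")
        _connections[tenant_id] = conn
    return _connections[tenant_id]


connection_for("tenant-a").execute("INSERT INTO note VALUES ('hello from A')")

a_rows = connection_for("tenant-a").execute("SELECT body FROM note").fetchall()
b_rows = connection_for("tenant-b").execute("SELECT body FROM note").fetchall()
```

Because each tenant resolves to a different database, the row written for one tenant is simply not present in the other tenant's database, with no tenant ID column required.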
Template D: Single Application, Multiple Sharded Multitenant Databases
This is a hybrid approach between templates (B) and (C). There's a single deployment of the application which is deployed to a single server or cloud application service. You maintain a separate "catalog" database or other API which acts as a lookup to determine which database belongs to which tenant. Some tenants may be in the same database as other tenants; however, you can reserve an entire database for certain large tenants.
The pros/cons of templates (B) and (C) still apply, except that here you have a greater degree of flexibility which can help mitigate some of the downside. This is also a cost-effective solution for small, startup type companies that want to maintain the capability to rapidly scale.
Note: This is the approach I will be using throughout this blog series for the demo application.
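A catalog for the sharded approach in Template D might look something like the following minimal Python sketch. The `CatalogEntry` type, the tenant and database names, and the `resolve` and `tenants_in` helpers are all hypothetical; the sketch only shows that several small tenants can map to one shared database while a large tenant maps to a database of its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    database: str    # hypothetical database name
    dedicated: bool  # True when the tenant has the database to itself


# Hypothetical catalog: small tenants share a shard, a large one gets its own.
CATALOG: dict[str, CatalogEntry] = {
    "tenant-a": CatalogEntry("shard-01", dedicated=False),
    "tenant-b": CatalogEntry("shard-01", dedicated=False),
    "big-corp": CatalogEntry("big-corp-db", dedicated=True),
}


def resolve(tenant_id: str) -> CatalogEntry:
    """Return the catalog entry for a tenant, failing loudly if it is unknown."""
    try:
        return CATALOG[tenant_id]
    except KeyError:
        raise LookupError(f"unknown tenant: {tenant_id}") from None


def tenants_in(database: str) -> list[str]:
    """List the tenants whose data lives in the given database."""
    return sorted(t for t, entry in CATALOG.items() if entry.database == database)
```

Note that tenants sharing a shard still need the tenant ID filtering described under Template B, since their rows live side by side in the same tables.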
Template E: Separate Application and Database Per Tenant
At first blush this doesn't seem that much different from template (A), except that you're going into it with a solid game plan and understand the pros/cons of maintaining multiple instances of your application and database. Because of the complexity involved, I consider this to be an advanced scenario and don't recommend you do this without really doing your homework first.
Certain CI/CD tools, such as Octopus Deploy or Azure DevOps Pipelines, support this model of a single application and single database per tenant. Using a specialized pipeline like one of the two mentioned above is the only feasible way I can see of doing this, as trying to maintain dozens of application instances and databases manually would turn into a maintenance nightmare.
Note that going this route comes with a surprising number of advantages, such as the fact that you can maintain separate versions of your system for different customers. However, this requires a nontrivial amount of DevOps knowledge and it may not be as cost-effective as template (D) because you are going to pay for each server, cloud application service, and database.
My opinion is that this is best suited to established, medium to large software organizations that already have very large customers and/or do not expect to have to maintain hundreds of tenants.
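To give a feel for the bookkeeping that Template E implies, here is a minimal Python sketch of a per-tenant deployment inventory. Everything in it is hypothetical (the `TenantDeployment` type, the tenant names, the example.com URLs, and the `behind` helper); a real deployment tool would track this for you. It shows that each tenant has its own application endpoint, database, and version, and that you need some way to see which tenants lag behind a target release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantDeployment:
    tenant_id: str
    app_url: str                   # hypothetical per-tenant application endpoint
    database: str                  # hypothetical per-tenant database name
    version: tuple[int, int, int]  # (major, minor, patch) currently deployed


DEPLOYMENTS = [
    TenantDeployment("acme", "https://acme.example.com", "acme-db", (2, 4, 0)),
    TenantDeployment("globex", "https://globex.example.com", "globex-db", (2, 1, 3)),
    TenantDeployment("initech", "https://initech.example.com", "initech-db", (1, 9, 0)),
]


def behind(target: tuple[int, int, int]) -> list[str]:
    """Return the tenants whose deployed version is older than the target."""
    return [d.tenant_id for d in DEPLOYMENTS if d.version < target]
```

Versions are stored as tuples so that Python's built-in tuple comparison orders them correctly, which plain string comparison would not.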
In this blog entry I touched upon some basic concepts, such as greenfield development vs. brownfield development. I gave an introduction to monoliths and microservices and discussed some pros/cons of each. I made an argument that, contrary to popular opinion, building applications as monoliths is not necessarily a bad thing, if those systems are modular in design and scalable, so that transitioning them to a microservice architecture later on is a possibility. Finally, I talked about multitenancy, and laid out a few different architectural templates which can be used to implement multitenant solutions.
Experts / Authorities / Resources
Cesar de la Torre, Bill Wagner, Mike Rousos
This is entry #5 in the Foundational Concepts Series