Approaches to Multi-tenancy

Adrian Wright

You're building a multi-tenant SaaS system. Congratulations! Every system has different requirements, but here are a few guidelines for design decisions, based on recent projects we've done.

Data Storage Options

One of the biggest choices to make in a multi-tenant system is how you'll store the data. There are a few primary options, and I'll discuss some high-level pros and cons of each:

Separate database per tenant

Each tenant gets a dedicated database. Databases can be on a single engine or multiple engines. This option is often considered for its security benefits -- each customer's data is in a silo. We'll discuss ways to manage a shared database later.

  • Flexibility at the database level -- For example, databases can be backed up separately. If customers want a backup of their data, or backups need to be performed at different times of day depending on periods of inactivity, this option gives the most flexibility. Backups can be stored for different lengths of time based on SLA.
  • Custom Data Schemas -- A separate database puts you in the best position to support custom schemas, if you need to.
  • Be aware of manageability concerns -- Each additional database creates more overhead for DBAs. Create a fully automated process for applying schema changes.
  • Using ORMs -- ORMs can be a very good option here, although you'll need a unique session factory per tenant. Standard disclaimers about the potential pitfalls of ORMs apply.
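
To make the "one session factory per tenant" and "automated schema changes" points concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `TenantDatabases` class, the tenant names, and the `invoice` table are all hypothetical, and in-memory SQLite databases stand in for real per-tenant databases.

```python
import sqlite3


class TenantDatabases:
    """Hypothetical registry that holds one dedicated database per tenant."""

    def __init__(self, tenant_ids):
        # In-memory SQLite databases stand in for real per-tenant databases.
        self._connections = {t: sqlite3.connect(":memory:") for t in tenant_ids}

    def connection_for(self, tenant_id):
        # Unknown tenants fail loudly instead of falling back to a shared database.
        try:
            return self._connections[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant: {tenant_id!r}") from None

    def apply_to_all(self, ddl):
        """Apply the same schema change to every tenant database.

        Returns the tenants that failed so the rollout can be retried or reported.
        """
        failed = []
        for tenant_id, conn in self._connections.items():
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(ddl)
            except sqlite3.Error:
                failed.append(tenant_id)
        return failed


dbs = TenantDatabases(["acme", "globex"])
failed = dbs.apply_to_all("CREATE TABLE invoice (id INTEGER PRIMARY KEY, total REAL)")

# Data written for one tenant never appears in another tenant's database.
dbs.connection_for("acme").execute("INSERT INTO invoice (total) VALUES (10.0)")
acme_count = dbs.connection_for("acme").execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
globex_count = dbs.connection_for("globex").execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
```

Returning the list of failed tenants, rather than stopping at the first error, is one possible design choice. With many databases, a partial rollout is likely, and you'll want to know exactly which tenants are behind.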

Separate schema per tenant

Data is stored in a single database, but each tenant gets their own schema. This provides some level of separation for tenant data, while easing the administrative pain of multiple databases.

  • Ease of Administration -- This option has a significant advantage over a separate database per tenant, since you only have one database (or a handful) to manage.
  • Custom data schemas -- This is similar to the previous option, in that it provides strong support for custom schemas, although this adds a lot of potential complexity to the application.
  • Security -- Although your data is in separate schemas, you may want to create a database user per-tenant and limit access to a single schema (this would also work in a separate database model).
  • Schema modifications -- These can be unwieldy, as change scripts need to be duplicated for each schema. Design an automated process for applying the same change to different schemas.
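
One way to avoid hand-duplicating change scripts is to write each change once as a template and render it per schema. The sketch below assumes a `{schema}` placeholder convention and hypothetical schema names such as `tenant_acme`; neither comes from a particular tool. It only generates the statements. Executing them, and tracking which schemas have been upgraded, is left to whatever migration process you use.

```python
import re

# Schema names are interpolated into DDL, so anything that is not a
# plain identifier is rejected.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_change(template, schemas):
    """Return one DDL statement per tenant schema from a single template.

    `template` uses `{schema}` as the placeholder (a convention assumed here).
    """
    statements = []
    for schema in schemas:
        if not _IDENTIFIER.fullmatch(schema):
            raise ValueError(f"unsafe schema name: {schema!r}")
        statements.append(template.format(schema=schema))
    return statements


change = "ALTER TABLE {schema}.invoice ADD COLUMN due_date DATE"
statements = render_change(change, ["tenant_acme", "tenant_globex"])
for statement in statements:
    print(statement)
```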

Single database, single schema

All data is shared in a single schema in a single database. This model has done well for Salesforce over the years.

  • Simple database manageability is the big win here, but it puts all the onus on the application or DBA to enforce usage of tenant columns on all queries.
  • Tenant columns are a must, and should be placed on every table, with the exception of reference data and some cross-tables.
  • Automatic filters and listeners are good options for enforcing adherence to the tenant columns. NHibernate's are documented here and here. Many large ORMs have similar features, but don't expect it if you're using a micro-ORM. A meticulous DBA can also play the role of applying tenant filters if you're using sprocs, and can do the best job of query tuning for large data sets.
  • Be ready for database performance tuning. With all data in a single schema, you may need to be ready to fall back to multiple databases after a tipping point has been reached. Table partitioning may also prove helpful for data that can be "aged out". As we all know, query tuning is an Achilles' heel for ORMs, so if you go that route, be ready to fall back to sprocs for highly performant queries.
  • Custom data fields are a challenge, although there are creative solutions available. Generic key/value pairs and XML data columns are two options. Ayende has a good summary here, and Izenda has a cogent discussion here.
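
If your data layer has no built-in filters, the same idea can be approximated by hand. The sketch below is not NHibernate's filter mechanism. It is a hypothetical `TenantScopedInvoices` wrapper over SQLite, with made-up table and tenant names, showing the principle that callers never supply or omit the tenant column themselves.

```python
import sqlite3


class TenantScopedInvoices:
    """Hypothetical data-access wrapper that applies the tenant filter itself."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, total):
        # The tenant column is populated here, never by the caller.
        with self._conn:
            self._conn.execute(
                "INSERT INTO invoice (tenant_id, total) VALUES (?, ?)",
                (self._tenant_id, total),
            )

    def totals(self):
        # Every read carries the tenant predicate.
        rows = self._conn.execute(
            "SELECT total FROM invoice WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, total REAL NOT NULL)"
)
# Index the tenant column, since every query filters on it.
conn.execute("CREATE INDEX ix_invoice_tenant ON invoice (tenant_id)")

acme = TenantScopedInvoices(conn, "acme")
globex = TenantScopedInvoices(conn, "globex")
acme.add(10.0)
acme.add(5.5)
globex.add(99.0)
```

Here `acme.totals()` only ever sees rows written through the `acme` wrapper, even though both tenants share one table. The weak spot is any query that bypasses the wrapper, which is why an automatic filter in the ORM, where available, is the safer option.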

By the way, if you want to read an exhaustive study of the various data storage options, check out Microsoft's summary here.

Other considerations for the start of your project

A few random things I've learned that might be good to consider from the get-go.

  • Treat tenant security as a first-class citizen -- In addition to the query filters discussed earlier, be aware of other tenant security considerations. For example, once the user has logged into a tenant, the application can take over tenancy from there. Rather than making the client side pass in a tenant ID on every POST, pull it from the session. This is simpler for the client, and more secure.
  • Dedicate time to test data generation -- Test data is really hard. A single database per tenant is nice because you don't have to worry about tenant columns and separate schemas. However, you still need to account for the ways in which your tenant data will vary wildly. Time should be spent to predict (guess?) what tenant data will look like so that a representative test data set can be built. Existing data from legacy systems is a great way to understand how data will be used, and can act as a data source if you have a migration built already. Single database, single schema provides a unique hurdle, as populating the tenant columns correctly on all records is a challenge, especially if you have to work with multiple levels of multi-tenancy.
  • Application customization tradeoffs -- Allowing application customizations per tenant can be a big win for the flexibility of your platform. However, it adds a lot of overhead and needs to be architected with the end goal in mind. A simple strategy pattern can provide sufficient separation for specific business logic required by a tenant, or you may require something more complex like MEF or a rules engine.
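
The "pull the tenant from session" advice from the first point above can be sketched as follows. This is a framework-free illustration: the `Session` type, the `handle_create_invoice` handler, and the form fields are all hypothetical, and a real application would use its web framework's session support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical server-side session created at login."""
    user_id: str
    tenant_id: str


def handle_create_invoice(session, form):
    """Handle a POST using the tenant bound to the session.

    Any `tenant_id` the client sends in `form` is ignored on purpose.
    """
    return {
        "tenant_id": session.tenant_id,  # trusted: set at login
        "total": float(form["total"]),
    }


session = Session(user_id="u1", tenant_id="acme")
# A tampered request that tries to write into another tenant.
record = handle_create_invoice(session, {"total": "12.50", "tenant_id": "globex"})
```

Because the handler never reads `tenant_id` from the request, the tampered value has no effect and the record is still written for the session's tenant.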
About the author: Adrian Wright, Distinguished Technical Consultant