mshank on Tue, 16 May 2017 23:27:08


Basic Analytics application/service (runs analysis against large data sets) on Azure. Uses sql azure database and azure storage.

Can anyone point to best practices (Azure sql design patterns)  for designing a scalable, multi-tenant application and can provide the best performance on the UI layer.

The DB is designed in a temporal way...for every new data set, new tables are created in the DB and when storage runs out, new databases are created.  - the intent of this design is that multiple tables and smaller databases (500GB to 1 TB in size) is better to run queries/analysis against.

Can anyone help critic this architecture. What are the issues with this design? Is this design scalable

Thank you



Martin.Cairney on Wed, 17 May 2017 02:14:30

If you are running analytics on large data sets then I'd look at either SQL DW, Data Lake Analytics or something in the HDInsight stack.

Adding new SQL DB databases could be overly complex - how do you select the correct Tier for each new database? Do you always have separation of Tenants?

Using SQL DW or one of the Big Data analytics can mean you don't have to worry about the capacity and can then focus on the functionality aspects.

mshank on Wed, 17 May 2017 04:24:45

Hi Martin

Thank you for the response. Yes, SQL DW / other big data technologies are a consideration.

To explain simplistically - Currently, there is a single DB in the system housing all tenant content and there are multiple sql db's housing tenant specific content (different tables for different tenants). As space runs out, more DB's are created (and there is a mechanism to know which DB has what content from what tenant). Any query or any analysis that needs to be run directly goes to the DB which houses the content (based on mapping) and fetches results and then sends it up for processing / analysis. 

Based on the above (and ignoring the fact that there are options to go to SQL DW or other options), what would you say is the design flaw in the current system? What happens as the application scales - i.e. more datasets and more tenants ...which means more tables and more databases.