Salesforce’s Database in the Clouds

Salesforce.com is well known as the pioneer of software as a service (SaaS) - the provision of hosted applications across the internet. Salesforce launched its SaaS CRM (Customer Relationship Management) product more than 10 years ago, and today claims over 70,000 customers.

It's less widely known that Salesforce.com has also been a pioneer in platform as a service (PaaS), and was one of the first to provide a comprehensive internet-based application development stack. In 2007 - well before the current buzz over cloud development platforms such as Microsoft Azure - Salesforce launched the Force.com platform, which allowed developers to run applications on the same multi-tenant architecture that hosts the Salesforce CRM.

Now Salesforce.com has gone one step further by exposing its underlying storage engine as a database as a service (DBaaS): Database.com is described as an elastically scalable, reliable and secure database made available as a cloud-based service. It's clearly not the first such offering - Amazon's SimpleDB, Microsoft's SQL Azure and many others make similar claims. However, because of its existing role in supporting Force.com and the Salesforce CRM, it can instantly claim a larger install base than any other DBaaS.

Although Database.com is built on top of a standard stack that includes Oracle RDBMS systems, it is not itself relational, and does not support the SQL language. Even so, Database.com will appear very familiar to developers. Database design is achieved using a visual designer that appears similar to an ER design tool; one creates objects that look like RDBMS tables and defines lookup and master-detail relationships between those objects.

APIs allow access to Database.com objects from a variety of languages, including Ruby, Java, PHP and others. A REST/JSON interface - which is more convenient for mobile and Web applications - is in pilot stage. These APIs provide support for basic "CRUD" (Create/Read/Update/Delete) operations. There is also a SQL-like query language - SOQL (Salesforce Object Query Language) - but, like similar query languages in other non-relational stores such as SimpleDB and Google App Engine, it supports a very limited set of operations. More complicated operations can be written in the interpreted server-side Apex language.
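To make the SOQL-over-REST idea concrete, here is a minimal sketch that only builds the request URL for a query; it does not contact any server. The instance host name, the API version and the helper name `build_query_url` are assumptions made for illustration, and the path follows the general shape of Salesforce's REST query resource rather than anything stated in this article.

```python
from urllib.parse import urlencode

# Assumed values for illustration only: a real deployment has its own
# instance host and supported API version.
INSTANCE = "https://example.my.salesforce.com"
API_VERSION = "v20.0"


def build_query_url(soql: str) -> str:
    """Return the URL for a SOQL query, with the query text URL-encoded."""
    return f"{INSTANCE}/services/data/{API_VERSION}/query?" + urlencode({"q": soql})


# SOQL looks like SQL, but selects fields from a single object type.
soql = "SELECT Id, Name FROM Account LIMIT 10"
url = build_query_url(soql)
print(url)
```

An authenticated HTTP GET against a URL of this kind would be expected to return the matching records as JSON; authentication is deliberately left out of the sketch.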

Salesforce is somewhat vague about the underlying architecture of Database.com, but it's fairly well known that, under the hood, the data is stored in one of a number of large Oracle RAC database clusters. Database.com objects are implemented on "flex" tables with a uniform structure that consists of simple variable-length character columns. API and SOQL calls are translated on the fly into efficient SQL calls against the underlying Oracle tables.
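The flex-table idea can be illustrated with a small sketch. This is not Salesforce's implementation: the table name `flex_data`, the column names, the `METADATA` dictionary and the helper functions are all hypothetical, and SQLite stands in for Oracle. It shows only the general technique of storing every logical object in one table of generic character columns and translating a logical field list into SQL by way of metadata.

```python
import sqlite3

# Hypothetical metadata: maps each logical object's fields to generic columns.
METADATA = {
    "Invoice": {"object_id": 1, "fields": {"Number": "val0", "Amount": "val1"}},
}


def create_flex_table(conn):
    # One uniform table for all tenants and objects; every value is text.
    conn.execute(
        "CREATE TABLE flex_data ("
        "tenant_id TEXT, object_id INTEGER, row_id INTEGER, "
        "val0 TEXT, val1 TEXT, val2 TEXT)"
    )


def insert_row(conn, tenant_id, obj, row_id, values):
    # Look up which generic column holds each logical field.
    meta = METADATA[obj]
    cols = [meta["fields"][name] for name in values]
    placeholders = ", ".join("?" for _ in cols)
    sql = (
        f"INSERT INTO flex_data (tenant_id, object_id, row_id, {', '.join(cols)}) "
        f"VALUES (?, ?, ?, {placeholders})"
    )
    conn.execute(sql, [tenant_id, meta["object_id"], row_id, *map(str, values.values())])


def translate_select(tenant_id, obj, field_names):
    # Turn a logical "select these fields from this object" into physical SQL.
    meta = METADATA[obj]
    select_list = ", ".join(f'{meta["fields"][f]} AS "{f}"' for f in field_names)
    sql = (
        f"SELECT {select_list} FROM flex_data "
        "WHERE tenant_id = ? AND object_id = ? ORDER BY row_id"
    )
    return sql, (tenant_id, meta["object_id"])


conn = sqlite3.connect(":memory:")
create_flex_table(conn)
insert_row(conn, "acme", "Invoice", 1, {"Number": "INV-1", "Amount": "250"})
insert_row(conn, "globex", "Invoice", 1, {"Number": "G-77", "Amount": "990"})

sql, params = translate_select("acme", "Invoice", ["Number", "Amount"])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Because every generated statement filters on the tenant identifier, rows belonging to different tenants can share one physical table while each tenant sees only its own data.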

As well as supporting structured queries, Database.com maintains a full-text index that allows unstructured keyword searches. Query optimization, resource governors and load balancing techniques are all employed to allow each user of Database.com to experience predictable performance, regardless of what other users sharing the same physical Oracle cluster might be doing.

Using a SQL database like Oracle to power a NoSQL solution like Database.com might seem odd, but it's not the first time SQL has been used to power NoSQL. For instance, the initial version of Amazon's NoSQL SimpleDB used MySQL as the underlying storage engine.

There are some reservations about the DBaaS model, however. Traditionally, the database and the application server have been co-located, reducing the network latency for database requests, and allowing both to be protected by the same firewall. Remotely locating the database across the internet - as in the Database.com model - exacts a heavy penalty in terms of performance and vulnerability to attack. It's not clear that there are compelling advantages in the DBaaS model that overcome these very significant drawbacks.
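A back-of-the-envelope calculation shows why latency matters here. The figures below are assumptions chosen purely for illustration (a 0.5 ms round trip inside a data center, a 50 ms round trip across the internet, 2 ms of execution time per query, 20 sequential queries per page); they are not measurements of Database.com or any other service.

```python
def total_time_ms(num_requests: int, round_trip_ms: float, exec_ms: float) -> float:
    """Elapsed time for sequential database requests, each paying one round trip."""
    return num_requests * (round_trip_ms + exec_ms)


# Assumed, illustrative figures only.
QUERIES_PER_PAGE = 20
EXEC_MS = 2.0

co_located = total_time_ms(QUERIES_PER_PAGE, round_trip_ms=0.5, exec_ms=EXEC_MS)
remote = total_time_ms(QUERIES_PER_PAGE, round_trip_ms=50.0, exec_ms=EXEC_MS)

print(f"co-located database: {co_located} ms")
print(f"remote database:     {remote} ms")
```

Under these assumed numbers the same page takes 50 ms with a co-located database and 1,040 ms with a remote one, almost all of it network wait. Issuing fewer, larger requests would narrow the gap, but the application then has to be designed around the network.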