The real secret sauce of SlicingDice is called S1Search, a Java-based Data Warehouse and Analytics Database developed by a very experienced team that has been handling an astonishing volume of data since 2012. S1Search was strongly inspired by the data insertion techniques and search engine concepts of Apache Lucene (and Elasticsearch), such as the inverted index, and it combines them with well-known concepts from other successful high-performance databases, such as column-store data modeling.
Databases are usually classified according to the abstraction they provide to the systems that use them. Many traditional databases follow the so-called relational model, which is backed by sound mathematical theory. Although this model is broadly adopted, alternatives to it have been growing steadily since the 2000s, as a multitude of databases were created with new models that allow cheaper management of big data.
The so-called non-relational databases span a list of categories that includes key-value stores, document stores, graph databases, wide-column stores and others. With so many competing models in the database ecosystem, it should be clear that a definitive solution has not yet been found and that there is still room for innovation. This is the case for S1Search: although similarities can be traced, none of the known models exactly describes what S1Search implements.
A key difference between S1Search and traditional databases lies in which columns are indexed and which are kept as a materialized view. Traditional databases keep a materialized view of every column and indexes on only a few columns, whereas S1Search keeps indexes on all columns and a materialized view of only a few.
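To make the "index every column" idea concrete, here is a minimal sketch in Python. It is not S1Search's actual implementation, which is not described here; the sample rows, the column names and the helpers `build_inverted_indexes` and `query_and` are all hypothetical and exist only for illustration. Each column gets an inverted index that maps a value to the ids of the rows containing it, so a multi-column filter can be answered by intersecting id lists without reading the rows themselves.

```python
from collections import defaultdict

# Hypothetical sample data; in this sketch the row id doubles as the record key.
rows = [
    {"id": 0, "country": "BR", "plan": "free"},
    {"id": 1, "country": "US", "plan": "paid"},
    {"id": 2, "country": "BR", "plan": "paid"},
]


def build_inverted_indexes(rows, columns):
    """Build one inverted index per column: {column: {value: [row ids]}}."""
    indexes = {col: defaultdict(list) for col in columns}
    for row in rows:
        for col in columns:
            indexes[col][row[col]].append(row["id"])
    return indexes


def query_and(indexes, conditions):
    """Return the ids of rows matching every (column, value) pair.

    Only the indexes are consulted; the original rows are never scanned.
    """
    id_sets = [set(indexes[col].get(value, [])) for col, value in conditions.items()]
    return sorted(set.intersection(*id_sets)) if id_sets else []


# Index every column, as the text describes (an assumption of this sketch).
indexes = build_inverted_indexes(rows, ["country", "plan"])
matching_ids = query_and(indexes, {"country": "BR", "plan": "paid"})
```

In this toy model the list `rows` plays the role of the materialized view; a design that indexes all columns can afford to materialize only the few columns it needs to return in full.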
More Compression = Lower Storage Costs = Lower Prices
Data compression is one of the areas where our developers have invested a huge amount of time and effort. Because of that, it's common for us to see data shrink to between 1/10 and 1/30 of its original size when it is inserted into S1Search.
There is no magic in what we do, and the logic is pretty straightforward: because we can store our customers' data using just a fraction of its original size, it costs us much less to store than it would with other solutions, so we can be much more aggressive on pricing.
Due to economies of scale, the more customers we get, the more S1Search servers we add, which makes our average storage cost even lower and consequently allows us to decrease our prices.
More Compression = Less IO = Faster Queries
A well-known rule of thumb is that access to primary memory is far faster than access to storage devices: around 1 million times faster when compared with hard drives and around 10 thousand times faster when compared with modern SSDs.
Therefore, efforts to reduce the amount of data transferred to and from disks are usually worth the price, and this is precisely the point of compressing data. At the cost of extra CPU cycles to compress and decompress data, a well-chosen compression scheme can greatly reduce the amount of data transferred. But one scheme won't fit them all.
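The trade-off can be seen in a small Python sketch. It uses `zlib` from the standard library purely as a general-purpose stand-in; nothing here reflects the schemes S1Search actually uses, and the helper `transfer_size` and the sample payload are hypothetical. The sketch spends CPU time compressing a repetitive payload and reports how many bytes would have to be written to or read from disk in each case.

```python
import zlib


def transfer_size(payload: bytes, level: int = 6):
    """Return (raw_bytes, compressed_bytes) for a payload.

    The CPU cost is paid in compress() and decompress(); the gain is the
    smaller number of bytes that has to travel to and from the disk.
    """
    compressed = zlib.compress(payload, level)
    # Round-trip check: compression must be lossless.
    assert zlib.decompress(compressed) == payload
    return len(payload), len(compressed)


# Hypothetical, highly repetitive payload (19 bytes repeated 10,000 times).
payload = b"2019-01-01,BR,free\n" * 10_000
raw_bytes, compressed_bytes = transfer_size(payload)
print(f"raw: {raw_bytes} bytes, compressed: {compressed_bytes} bytes")
```

Real data is rarely this repetitive, so the ratio obtained here should not be read as representative of any production workload.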
Since every type of data has its own characteristics, different compression schemes might be required to achieve optimal compression ratios, and we've implemented many of them in S1Search.
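As an illustration of why the data type matters, the Python sketch below contrasts two textbook schemes. They are chosen here only as examples and are not claimed to be the ones implemented in S1Search; all function names are hypothetical. Delta encoding suits sorted numeric values such as timestamps, because the differences between neighbours are small, while run-length encoding suits columns in which the same value repeats many times in a row.

```python
def delta_encode(values):
    """Keep the first value, then only the difference to the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values):
    """Collapse consecutive repeats into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]


# Sorted timestamps: small deltas need fewer bits than the full values.
timestamps = [1000, 1001, 1003, 1010]
deltas = delta_encode(timestamps)

# Low-cardinality column: long runs collapse into a few pairs.
countries = ["BR", "BR", "BR", "US", "US"]
runs = rle_encode(countries)
```

Applying run-length encoding to the timestamps, or delta encoding to the country codes, would gain little or nothing, which is the sense in which no single scheme fits every kind of data.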