End-to-end data parallelism is easier to achieve with a database built on a massively parallel processing (MPP) architecture. Such a database is well placed to use all available resources, because it can parallelize everything from data loading to querying and from backups to recoveries. MPP should not be your only criterion when choosing a database, however; you should also look for a hybrid row/columnar architecture. With it, you can store data in whichever format suits your requirements.
Simply put, a hybrid row/columnar architecture lets you use columnar storage when the data feeds canned BI reports, and row-based storage for exploratory or advanced analytics. For analytics, you also benefit when the database makes advanced analytic applications easy to build. One way to judge this is to look for pre-packaged functions and tools, and to check that they reduce the time and effort required to deliver deeper insights.
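To make the difference between the two layouts concrete, here is a minimal Python sketch. It is only a toy model, not how any particular database stores data; the table, its field names, and the helper functions are all hypothetical. It shows why an aggregate over one field is cheap on a column layout, while fetching a whole record is more natural on a row layout.

```python
# Toy illustration of row-oriented vs. column-oriented layouts.
# The table, field names, and helpers are hypothetical.

# Row layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 30.0},
]

# Column layout: one list per field, aligned by position.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount_columnar(cols):
    # An aggregate reads only the single column it needs.
    return sum(cols["amount"])


def fetch_record_row_store(table, order_id):
    # A whole-record lookup returns the record in one piece.
    for record in table:
        if record["order_id"] == order_id:
            return record
    return None


def fetch_record_columnar(cols, order_id):
    # The same lookup on a column layout has to reassemble
    # the record by reading one value from every column.
    i = cols["order_id"].index(order_id)
    return {name: values[i] for name, values in cols.items()}
```

Both layouts hold the same data, so both lookups return the same record; what differs is how much has to be read to answer each kind of question.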
These tools should include an IDE that lets you create SQL and MapReduce applications and that simplifies developing, testing, and deploying advanced analytic applications. Choose the database only if you know it will facilitate fast, easy development of rich analytic applications and significantly reduce the amount of code you have to write. Advanced analytics also cannot rely on SQL alone, so the database has to help you reach insights richer than those achievable through SQL alone.
Given below are some other characteristics to look for in a database when analytics is your focus:
In-database analytic processing: This type of processing eliminates the overhead typically associated with moving large data sets to custom analytic software applications. In-database processing is also likely to bring significant performance benefits.
Optimized architecture: For performance-optimized big data analytics, make sure the architecture of the database has been designed specifically for that purpose.
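The in-database processing point above can be sketched in a few lines of Python. This is an illustration under stated assumptions: SQLite from the standard library stands in for the analytic database (it is not an MPP system), and the `sales` table and function names are hypothetical. The first function moves every row to the application before aggregating; the second lets the database aggregate, so only one row per region leaves it.

```python
import sqlite3


def build_demo_db():
    # In-memory SQLite database as a stand-in; the schema is hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 120.0), ("west", 75.5), ("east", 30.0), ("west", 24.5)],
    )
    return conn


def totals_client_side(conn):
    # Moves every row out of the database, then aggregates in application code.
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    # Aggregates inside the database; only one row per region is returned.
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return dict(conn.execute(query).fetchall())


conn = build_demo_db()
```

Both functions produce the same totals. With four rows the difference is invisible, but the client-side version transfers data in proportion to the table size, which is the overhead that in-database processing is meant to avoid.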