Why we love MongoDB at DataGenic
I first heard about MongoDB during the ETOT Data Management Workshop (link to post) from our CTO, Colin Hartley, back in October last year. The workshop covered trends and best practices in data management and its importance in the wider business intelligence context across an organisation.
MongoDB continues to grow in popularity
One of Colin’s slides showed databases ranked by popularity, with MongoDB taking 4th place behind DB dinosaurs Oracle, MySQL and Microsoft SQL Server. Not bad considering MongoDB has only been around for 8 years.
With two large scale projects underway – US ISO Power Data & Sentiment Analysis – I decided to catch up with our resident MongoDB expert (and fan boy), Kowshik NS, to find out what makes MongoDB so special and what it means for our commodity data management solutions and clients.
Frank: Hi Kowshik, thanks for taking some time out of your busy schedule to talk to me today. Can you tell me how MongoDB differs from traditional database applications and why you and the guys in the development team are so excited about its potential?
Kowshik: The major difference is how MongoDB stores data. Unlike the major relational databases, which store information in relational tables and require fixed schemas to store, access, retrieve and interrogate data, MongoDB is essentially a document store. That gives us operational flexibility and allows changes at the application level that would be cumbersome with a traditional relational DBMS.
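To make that concrete, here is a minimal sketch of the document model's flexibility. The collection and field names below are hypothetical illustrations, not DataGenic's actual schema:

```python
# Two documents in the same (hypothetical) "prices" collection.
# Unlike rows in a relational table, they need not share a schema:
# the second adds a nested "quality" sub-document with no migration
# or ALTER TABLE step required.
doc_a = {
    "symbol": "PJM.DA.HUB",   # hypothetical market symbol
    "date": "2017-03-01",
    "value": 31.25,
}
doc_b = {
    "symbol": "MISO.RT.HUB",
    "date": "2017-03-01",
    "value": 28.10,
    "quality": {"source": "ISO feed", "revised": True},  # extra field, same collection
}

# Against a live server this would simply be:
#   from pymongo import MongoClient
#   MongoClient().energy.prices.insert_many([doc_a, doc_b])
```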
Frank: So talking about the benefits in more detail, I know you are currently working on a US ISO Power Database. How is MongoDB making that job easier for you?
Kowshik: ISO stands for Independent System Operator; these are organisations formed at the direction or recommendation of the Federal Energy Regulatory Commission (FERC). Where an ISO is established, it coordinates, controls and monitors the operation of the electrical power system, usually within a single US state, but sometimes encompassing several. ISO power data stands out for its high volumes, high frequency and demanding latency requirements. Storing those volumes in the number-one-ranked NoSQL system, MongoDB, was an easy choice.
One of MongoDB’s strengths is its aggregation framework, a necessary feature when interrogating large sets of historical data. While we were talking I ran an aggregation across 23 million documents, each containing at least 12 data points; it took less than 0.3 seconds.
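For readers new to the aggregation framework: a pipeline is just an ordered list of stages, each a plain document. A minimal sketch (field names are hypothetical; the actual US ISO schema isn't shown in this post):

```python
# Hypothetical pipeline: average price per symbol over one month.
# $match filters documents, $group aggregates them, $sort orders the output.
pipeline = [
    {"$match": {"date": {"$gte": "2017-02-01", "$lt": "2017-03-01"}}},
    {"$group": {"_id": "$symbol", "avg_value": {"$avg": "$value"}}},
    {"$sort": {"avg_value": -1}},
]

# Against a live collection this would run as:
#   results = db.prices.aggregate(pipeline)
```

Because each stage feeds the next, the server can filter early and aggregate only the matching documents, which is what makes sub-second scans over millions of documents feasible.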
Frank: So aside from its flexibility and ability to aggregate large datasets in record time, are there any other benefits to using MongoDB here at DataGenic?
Kowshik: Most definitely – scalability! We are using an approach called MongoDB Sharding. This approach largely automates horizontal scaling by distributing data as shards across a cluster, keeping the system operational throughout. It means we can grow the database without disturbing the existing live production environment, something that is mission critical for a number of our globally operating clients.
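As an illustration of how little ceremony that involves (the database, collection and key names here are hypothetical examples): sharding a collection comes down to choosing a shard key and issuing one admin command, after which MongoDB distributes and balances the data.

```python
# The shardCollection admin command expressed as a plain document.
# "energy.prices" and the compound key are hypothetical; a good shard
# key spreads writes evenly and matches common query patterns.
shard_cmd = {
    "shardCollection": "energy.prices",
    "key": {"symbol": 1, "date": 1},  # compound shard key
}

# With pymongo against a mongos router this would be:
#   client.admin.command(shard_cmd)
# Or equivalently in the mongo shell:
#   sh.shardCollection("energy.prices", { symbol: 1, date: 1 })
```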
In addition to the actual features of MongoDB, I really appreciate and value their active user community, both online and at their regular events. We attended the MongoDB Europe event in November last year; it was great to hear more about the product roadmap, attend a ‘clinic session’ where we could talk with one of their experts, and hear some interesting use cases on how customers make the most of MongoDB.
Frank: That sounds great, I feel the same way about HubSpot and their annual user conference, it’s a great way to get new ideas and take them back in your day-to-day job. Any particular customer showcase that stood out for you?
Kowshik: Yes, Barclays gave a great talk about how they reduced reliance on mainframe systems by moving customer data onto a more agile and resilient MongoDB platform. I believe this was covered in a ComputerWorld article, and what I found most interesting was that one of the bank’s use cases was ‘serving up’ historical transaction data to its customers, something that had proved difficult in the past when “online transaction histories were limited to 300 entries, due to the difficulty in retrieving the huge amount of data from the mainframe systems.” That isn’t too dissimilar from us preparing large sets of historical ISO data for clients to interrogate and analyse.
Frank: That sounds amazing! Thank you for taking the time to give me a glimpse at MongoDB and how it is helping us better handle and serve big data.