NoSQL: MongoDB, Cassandra, or an alternative for data warehousing


I am stuck on a concrete decision about whether to go with MongoDB or Cassandra for my database needs, and would like input on my use case to guide the decision.

Requirements:

Data source:

  • X datacenters containing Y servers.
  • Each server has N networks and M statistics.

E.g. (3 datacenters, 50 servers in total, 19 networks, 10 statistics). These numbers will increase over time.

Data fetching:

  • Parse an XML page from each server (~20 KB per page) every hour (~25 MB per day in total).
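
The daily figure can be checked with a few lines of arithmetic. This is only a rough estimate using the numbers quoted above (50 servers, ~20 KB per page, one fetch per hour); the function and variable names are made up for illustration, and the yearly projection assumes the server count stays fixed, which the question says it will not.

```python
# Rough volume estimate from the figures in the question.
SERVERS = 50          # total servers across all datacenters
PAGE_KB = 20          # approximate size of one XML page
FETCHES_PER_DAY = 24  # one fetch per server per hour


def raw_kb_per_day(servers: int, page_kb: float, fetches_per_day: int) -> float:
    """Raw XML volume fetched per day, in kilobytes."""
    return servers * page_kb * fetches_per_day


daily_kb = raw_kb_per_day(SERVERS, PAGE_KB, FETCHES_PER_DAY)
daily_mb = daily_kb / 1000         # decimal megabytes
yearly_gb = daily_mb * 365 / 1000  # decimal gigabytes, at a fixed server count

print(f"{daily_mb:.0f} MB/day, {yearly_gb:.2f} GB/year")
```

At this scale the raw input stays under 10 GB per year, which fits the later note that "massive" write performance is not needed.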

Data storage:

  • Organized in an hourly, daily, monthly structure, using aggregation to compute the higher-level values (hours -> day).
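
As a sketch of the hours -> day step, independent of which database is chosen: the record layout below (`ts`, `server`, `network`, `stat`, `value`) and the sample values are assumptions made for illustration, not a schema from the question. The `combine` argument is there because the question does not say whether the daily value should be a maximum, a sum, or something else.

```python
from collections import defaultdict
from datetime import datetime


def roll_up_daily(hourly_samples, combine=max):
    """Group hourly samples by (day, server, network, stat) and combine their values."""
    buckets = defaultdict(list)
    for sample in hourly_samples:
        key = (sample["ts"].date(), sample["server"], sample["network"], sample["stat"])
        buckets[key].append(sample["value"])
    return {key: combine(values) for key, values in buckets.items()}


# Hypothetical hourly samples for one server, network, and statistic.
samples = [
    {"ts": datetime(2012, 8, 16, 0), "server": "srv-01", "network": "net-a", "stat": "error", "value": 3},
    {"ts": datetime(2012, 8, 16, 1), "server": "srv-01", "network": "net-a", "stat": "error", "value": 7},
    {"ts": datetime(2012, 8, 17, 0), "server": "srv-01", "network": "net-a", "stat": "error", "value": 2},
]

daily_max = roll_up_daily(samples)               # highest hourly value per day
daily_sum = roll_up_daily(samples, combine=sum)  # total per day
```

The same function could produce the monthly level by keying on `(ts.year, ts.month)` instead of `ts.date()`.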

Note: I need the ability to:

  • Dynamically add/remove values (datacenters, servers, networks, statistics). Scalability is a key issue, hence the move from SQL to NoSQL.
  • Reliability is a high priority (master/slave, no corruption), and I require "easy" maintainability.
  • Writes happen hourly, so there is no need for "massive" write performance.

Example use case: on the front end I would query like so, selecting a date window, the report period, a specific datacenter, specific or all networks, specific or all statistics, and whether results are totalled or shown individually across servers.

Example #1:

  • From: August 16th 2012 -> April 16th 2013
  • Period: daily
  • Data-center: EU
  • Stat-type: error
  • Servers:
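
If MongoDB were chosen, a report like Example #1 might be expressed as an aggregation pipeline. The sketch below only builds the pipeline as plain Python data and does not connect to a database. The field names (`ts`, `datacenter`, `stat`, `server`, `value`) are assumptions, not a schema from the question, and summing is just one possible way to total the values. The server filter is left out because the example above does not specify one.

```python
from datetime import datetime


def build_report_pipeline(start, end, datacenter, stat, totalled=True):
    """Build MongoDB aggregation stages for a daily report (hypothetical schema)."""
    # Bucket each sample by calendar day.
    day = {"$dateToString": {"format": "%Y-%m-%d", "date": "$ts"}}
    group_id = {"day": day}
    if not totalled:
        # Keep one row per server instead of totalling across servers.
        group_id["server"] = "$server"
    return [
        {"$match": {
            "ts": {"$gte": start, "$lt": end},
            "datacenter": datacenter,
            "stat": stat,
        }},
        {"$group": {"_id": group_id, "value": {"$sum": "$value"}}},
        {"$sort": {"_id.day": 1}},
    ]


# The end bound is exclusive, so April 17th is used to include April 16th.
pipeline = build_report_pipeline(
    datetime(2012, 8, 16), datetime(2013, 4, 17), "eu", "error"
)
```

With PyMongo, a list like this would be passed to `collection.aggregate(pipeline)`. Passing `totalled=False` covers the "individual across servers" case by adding the server to the grouping key.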

From reading similar articles across Stack Overflow and the web, I've come to the conclusion that my best bet may be MongoDB, for its flexible queries and its closeness to a relational database. Cassandra seems like the option if writes were of higher volume, although it uses a column-based model. I'm a novice at database design and management, so ease of use is a factor (I'm still a CS student).

Given these use cases, which NoSQL database is the best option?

You pretty much nailed it in your conclusion. To make up your mind, you have to choose between the perks of each database:

Cassandra:

  • Better availability (master/master, no SPOF)
  • Better scalability (linear, elastic)
  • Better write performance

MongoDB:

  • Better queries (API, native full-text search)
  • Ease of use (variety of APIs, XML/JSON...)

Consistency isn't much of an issue, I guess, and anyway they're both consistent. While MongoDB is easier to get started with (it is closer to the relational data model), Cassandra isn't hard either; you just have to understand the column-oriented paradigm. Anyway, from a technical point of view, I guess the answer depends on how you expect your system to grow in size and whether your requests will evolve or not.

