.NET - Azure Table Storage Performance from Massively Parallel Threaded Reading
Short version: Can we read dozens or hundreds of table partitions in a multi-threaded manner to increase performance by orders of magnitude?
Long version: We're working on a system that stores millions of rows in Azure Table Storage. We partition the data into small partitions, each one containing 500 records, which represents a day's worth of data for a unit.
Since Azure doesn't have a "sum" feature, to pull a year's worth of data we either have to use pre-caching or sum the data ourselves in an Azure web or worker role.
Assuming the following:
- Reading a partition doesn't affect the performance of reading another
- Reading a partition has a bottleneck based on network speed and server retrieval
We can then take a guess that if we wanted to sum a lot of data on the fly (1 year, 365 partitions), we could use a massively parallel algorithm and scale it with the number of threads. For example, we could use the .NET parallel extensions with 50+ threads and get a huge performance boost.
We're working on setting up some experiments, but I wanted to see if this has been done before. Since the .NET side is mostly idle while waiting on high-latency operations, this seems perfect for multi-threading.
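A minimal sketch of the experiment described above, assuming a hypothetical ReadPartition helper that only simulates the round trip (it sleeps, then returns 500 dummy values) instead of calling the real storage client. The 100 ms latency, the value of each record, and the class and method names are all assumptions made for illustration. It uses Parallel.ForEach with a capped degree of parallelism and adds each partition's subtotal to a shared total.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelPartitionSum
{
    // Hypothetical stand-in for one partition read. It does not touch Azure:
    // it blocks for an assumed 100 ms round trip and returns 500 dummy values.
    static long[] ReadPartition(int day)
    {
        Thread.Sleep(100);
        return Enumerable.Repeat(1L, 500).ToArray();
    }

    static void Main()
    {
        long total = 0;

        // Upper bound on concurrent reads; 50 mirrors the figure in the question.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 50 };

        // One iteration per daily partition (365 partitions for one year).
        Parallel.ForEach(Enumerable.Range(1, 365), options, day =>
        {
            long partitionSum = ReadPartition(day).Sum();

            // Several iterations finish at once, so the shared total
            // must be updated atomically.
            Interlocked.Add(ref total, partitionSum);
        });

        Console.WriteLine(total); // prints 182500 (365 partitions x 500 records x 1)
    }
}
```

MaxDegreeOfParallelism is only an upper bound. Because each iteration blocks a thread, the thread pool may take a while to grow toward 50 workers, so the measured speed-up in a short run can be smaller than the cap suggests.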
There are limits imposed on the number of transactions that can be performed against a storage account, and against a particular partition or storage server, in a given time period (somewhere around 500 req/s). So in that sense, there is a reasonable limit to the number of requests you could execute in parallel (before it begins to look like a DoS attack).
Also, in the implementation, be wary of any concurrent connection limits imposed on the client, such as those from System.Net.ServicePointManager. I'm not sure whether the Azure storage client is subject to those limits; it might require adjustment.
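If the storage client does go through the standard .NET Framework HTTP stack, the adjustment might look like the sketch below. Whether the client honours this setting is exactly the open question above, and the value 50 is an assumption chosen to match the number of parallel reads, not a recommendation.

```csharp
using System;
using System.Net;

static class ConnectionLimitSetup
{
    static void Main()
    {
        // Raise the per-endpoint cap on concurrent outbound HTTP connections.
        // Set this once at startup, before the first request is issued,
        // because connections to an endpoint pick up the limit when first created.
        ServicePointManager.DefaultConnectionLimit = 50; // assumed value

        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```

On newer .NET runtimes the HTTP stack is configured differently, so this setting may have no effect there.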