Groking Greg Young’s Super Simple CQRS Example

Greg Young has coded up a sample CQRS app with the aim of creating the simplest possible implementation.

I have been looking through the code to try and better understand CQRS.
Greg Young’s Blog
Download the Sample App Here

The diagram above is based on stepping through the code after the user submits a new Inventory Item.

I’m still not sure I fully understand things but it is starting to become clearer.


RDBMS performance implications of NAND Flash and DRAM pricing trends

This article on shows an interesting trend and prediction in the pricing change between DRAM and NAND Flash, as you can see from the chart both are dropping but NAND is dropping faster.

What acutally happened was that NAND prices rose, likely due to demand from companies like Apple, below the first chart shows the price of 8GB MLC NAND Flash from 2007 to today. These modules are commonly used in smaller capacity 2.5″ solid state disk drives. Dram prices also rose as depicted in the second chart below.

MLC Flash 8Gb Module Pricing 2007 – Today

Dram Pricing DDR2 1Gb Module 2007 – Today

There is currently $$$ being spent in building new manufacturing facilities to produce NAND flash so unless new sources of demand appear prices will start to fall again soon. So what does this mean if you are currently designing an enterprise solution that will be deployed to production in the next 12 months and be used for up to the the next 10 years? Well it depends on your application but if it makes significant use of a relational database management system or RDBMS then you can expect a huge decrease in the cost of delivering IOPS.

Calculating IOPS cost

Today you can purchase a FusionIO card for ~$7500 that offers 320GB of storage and a R/W 100k/140k IOPS, a single 15k 450GB SAS drive costs around $300 and will deliver ~175 IOPS to match the mixed  read/write performance of the card you would need to purchase 300 drives, enclosures and controllers, a rough calculation:
15k Disks – 300 x $300 = $90,000
Enclosures – 10 x $5000 = $50,000
Total = $140,000

RAID IOPS Calculator
The problem with the FusionIO solution is that it does not scale easily each card must be installed into a Pci-E slot in a server, most 1U servers will only have one slot. It is also directly attached to the server, when using a SAN storage can be shared amongst servers. If on top of this you are taking advantage of virtualisation it is possible to move VM’s from one physical host to another in realtime and your server will remain online and suffer a small reduction in performance for a few minutes. This makes it possible to take physical machines offline for maintenance without affective the availablity of your application. FusionIO does not offer fault tolerance if the physical machine that the card is installed in fails then you lose access to that storage until the machine becomes available again. In this situation you must have a mirror machine available with a recent copy of your data. But this is still a good value solution at $15,000 vs $140,000.

The water is a little bit murky

It is still not clear exactly how vendors are going to deliver the level of IOPS performance offerd by FusionIO in a SAN based format. The problem seems to center around the speed of interfaces like iSCSI and the bandwidth and latency of gigabit ethernet. Although the storage devices already exist there is a lot of work needed to improve the pipeline between hosts and SAN. That said this is a solvable problem, it’s not “if” but “when”.

The impact on database performance

RDBMS’s like Sql Server, MySql and Oracle are going to get a huge performance boost as NAND Flash based storage systems come into use, typically IO has been the bottleneck in any database based application in future this will likely change to CPU. Oracle and Sql Server are both licensed per CPU/Core so when building applications this should be taken into account. Any operation that requires a significant amount of CPU time should not be perfromed by the database unless absolutely necessary. Databases should just be used for what they were originaly built and that is to persist data and allow it to be retrieved. The need for inline caches should be reduced due to this performance boost.