RavenDB Sharding – Map/Reduce in a cluster
In my previous post, I introduced RavenDB Sharding and discussed how we can use sharding in RavenDB. We discussed both blind sharding and data driven sharding.
Today I want to introduce another aspect of RavenDB Sharding. The usage of Map/Reduce to gather information from multiple shards.
We start by defining a map/reduce index. In this case, we want to look at the invoice totals per date.
We define the index like this:
public class InvoicesAmountByDate : AbstractIndexCreationTask<Invoice, InvoicesAmountByDate.ReduceResult>
{
public class ReduceResult
{
public decimal Amount { get; set; }
public DateTime IssuedAt { get; set; }
}
public InvoicesAmountByDate()
{
Map = invoices =>
from invoice in invoices
select new
{
invoice.Amount,
invoice.IssuedAt
};
Reduce = results =>
from result in results
group result by result.IssuedAt
into g
select new
{
Amount = g.Sum(x => x.Amount),
IssuedAt = g.Key
};
}
}
And then we execute the following code:
using (var session = documentStore.OpenSession())
{
var asian = new Company { Name = "Company 1", Region = "Asia" };
session.Store(asian);
var middleEastern = new Company { Name = "Company 2", Region = "Middle-East" };
session.Store(middleEastern);
var american = new Company { Name = "Company 3", Region = "America" };
session.Store(american);
session.Store(new Invoice { CompanyId = american.Id, Amount = 3, IssuedAt = DateTime.Today.AddDays(-1)});
session.Store(new Invoice { CompanyId = asian.Id, Amount = 5, IssuedAt = DateTime.Today.AddDays(-1) });
session.Store(new Invoice { CompanyId = middleEastern.Id, Amount = 12, IssuedAt = DateTime.Today });
session.SaveChanges();
}
We use a three way sharding, based on the region of the company, so we actually have the following document sin three different servers:
First server, Asia:
Second server, Middle East:
Third server, America:
Now, let us see what happen when we use the map/reduce query:
using (var session = documentStore.OpenSession())
{
var reduceResults = session.Query<InvoicesAmountByDate.ReduceResult, InvoicesAmountByDate>()
.ToList();
foreach (var reduceResult in reduceResults)
{
string dateStr = reduceResult.IssuedAt.ToString("MMM dd, yyyy", CultureInfo.InvariantCulture);
Console.WriteLine("{0}: {1}", dateStr, reduceResult.Amount);
}
Console.WriteLine();
}
As you can see, again, we make no distinction in our code about using sharding, we just query it normally. The results, however, are quite interesting:
As you can see, we got the correct results, cluster wide.
RavenDB was able to query all the servers in the cluster for their results, reduce them again, and get us the total across all three servers.
And that, my friends, it truly awesome.
Reference: RavenDB Sharding–Map/Reduce in a cluster from our NCG partner Oren Eini at the Ayende @ Rahien blog.