There are a number of database sharding strategies that meet the diverse requirements of different categories of application.
Shard by Modulus
For many applications, it is appropriate to shard based on a shard key such as a User ID. Using a modulus of a numeric ID, especially an auto increment primary key, ensures even distribution of data between shards.
Shard by Date/Time Range
For time-based data such as feeds or blogs where data is accumulating over time, it may make sense to shard by date range. For example, each shard could contain data for a single month. New shards can be added each month and old shards can be dropped once historic data is no longer needed.
Master Lookup
It is sometimes a requirement to control the sharding manually or in an application specific manner. One example would be a requirement to host key customer accounts on shards hosted on higher specification hardware. To support this requirement, a master shard can be created which contains lookup tables to map customer IDs to a specific shard number.
Session-based Sharding
Some categories of application, particularly user-centric web applications, can choose a shard when a customer logs in and then direct all queries to the same shard for the duration of that user session.
Fixed Shard
Tables are mapped to specific fixed shards. Also known as table based sharding.
Custom Sharding
If there is any specific logic for sharding, a piece of code can be used to shard data based on that logic.
Global Tables
Global tables are tables which are hosted in all shards and data is automatically replicated across all shards. The benefit is that these tables can be used in joins with sharded tables in each shard. Global tables are typically fairly static tables or with low write volume, such as product codes, countries, and other reference data.
Showing posts with label database theory. Show all posts
Showing posts with label database theory. Show all posts
Tuesday, January 25, 2011
Wednesday, September 02, 2009
DB Basics : Types of joins
Join Types
So, you have designed a normalized database design for your application. Maybe up to the 3rd normal form. And now, when you need to run queries, you would need to join the tables in the query to get the required information. There has to be some common data that allow those tables to be connected in some meaningful way. Although it’s possible to have more than one common column between two tables, most often, the join key will be the primary key of one table and a foreign key in the other.
Lets perform queries against the following table.
company : company_id (pk), company_name
model : model_id(pk), company_id(fk), model_name
Here, we have a company table and a model table in 1:n relationship. A car company can have multiple models in the market. Let us take some sample data
Inner join : An inner join is defined as a join in which rows must match in both tables in order to be included in the result set.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 INNER JOIN model t2 on t1.company_id = t2.company_id where t1.company_name = 'Audi';
+--------------+-------+
| Manufacturer | Model |
+--------------+-------+
| Audi | A4 |
| Audi | A6 |
| Audi | A8 |
| Audi | Q7 |
+--------------+-------+
Outer Join : Outer joins will return records in one table that aren’t matched in another. Outer joins can be further divided into the two types of left and right. In a left outer join, all records from the first (left-hand) table in a join that meet any conditions set in the WHERE clause are returned, whether or not there’s a match in the second (right-hand) table.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 left join model t2 on t1.company_id = t2.company_id where t1.company_name in ('Toyota','BMW');
+--------------+--------+
| Manufacturer | Model |
+--------------+--------+
| Toyota | Corola |
| Toyota | Camry |
| Toyota | Innova |
| BMW | NULL |
+--------------+--------+
Here 'BMW' is returned even when it does not have any entry in the model Table.
Similar to the left outer join, a right outer join returns all records from the second (right-hand) table in a join that meet any conditions set in the WHERE clause, whether or not there’s a match in the first (left-hand) table.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 right join model t2 on t1.company_id = t2.company_id where t2.model_name in ('Custom','Nano');
+--------------+--------+
| Manufacturer | Model |
+--------------+--------+
| Tata | Nano |
| NULL | Custom |
+--------------+--------+
Cross-join :
The cross-join, also referred to as a Cartesian product, returns all the rows in all the tables listed in the query. Each row in the first table is paired with all the rows in the second table. This happens when there is no relationship defined between the two tables. We do not require cross join in our general applications, so we should try to avoid it. A cross join happens when we fail to provide a filler for the join in the query.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1, model t2;
+--------------+-----------+
| Manufacturer | Model |
+--------------+-----------+
| Audi | A4 |
| GM | A4 |
| Ford | A4 |
| Toyota | A4 |
| Tata | A4 |
| BMW | A4 |
| Audi | A6 |
| GM | A6 |
.....
.....
.....
| Ford | Custom |
| Toyota | Custom |
| Tata | Custom |
| BMW | Custom |
+--------------+-----------+
So, you have designed a normalized database design for your application. Maybe up to the 3rd normal form. And now, when you need to run queries, you would need to join the tables in the query to get the required information. There has to be some common data that allow those tables to be connected in some meaningful way. Although it’s possible to have more than one common column between two tables, most often, the join key will be the primary key of one table and a foreign key in the other.
Lets perform queries against the following table.
company : company_id (pk), company_name
model : model_id(pk), company_id(fk), model_name
Here, we have a company table and a model table in 1:n relationship. A car company can have multiple models in the market. Let us take some sample data
company_id | company_name |
1 | Audi |
2 | GM |
3 | Ford |
4 | Toyota |
5 | Tata |
6 | BMW |
model_id | company_id | model_name |
1 | 1 | A4 |
2 | 1 | A6 |
3 | 1 | A8 |
4 | 1 | Q7 |
5 | 2 | Chevrolet |
6 | 2 | Hummer |
7 | 3 | Ikon |
8 | 3 | Endeavor |
9 | 3 | Fiesta |
10 | 4 | Corola |
11 | 4 | Camry |
12 | 4 | Innova |
13 | 5 | Indica |
14 | 5 | Nano |
15 | 5 | Safari |
16 | Null | Custom |
Inner join : An inner join is defined as a join in which rows must match in both tables in order to be included in the result set.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 INNER JOIN model t2 on t1.company_id = t2.company_id where t1.company_name = 'Audi';
+--------------+-------+
| Manufacturer | Model |
+--------------+-------+
| Audi | A4 |
| Audi | A6 |
| Audi | A8 |
| Audi | Q7 |
+--------------+-------+
Outer Join : Outer joins will return records in one table that aren’t matched in another. Outer joins can be further divided into the two types of left and right. In a left outer join, all records from the first (left-hand) table in a join that meet any conditions set in the WHERE clause are returned, whether or not there’s a match in the second (right-hand) table.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 left join model t2 on t1.company_id = t2.company_id where t1.company_name in ('Toyota','BMW');
+--------------+--------+
| Manufacturer | Model |
+--------------+--------+
| Toyota | Corola |
| Toyota | Camry |
| Toyota | Innova |
| BMW | NULL |
+--------------+--------+
Here 'BMW' is returned even when it does not have any entry in the model Table.
Similar to the left outer join, a right outer join returns all records from the second (right-hand) table in a join that meet any conditions set in the WHERE clause, whether or not there’s a match in the first (left-hand) table.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1 right join model t2 on t1.company_id = t2.company_id where t2.model_name in ('Custom','Nano');
+--------------+--------+
| Manufacturer | Model |
+--------------+--------+
| Tata | Nano |
| NULL | Custom |
+--------------+--------+
Cross-join :
The cross-join, also referred to as a Cartesian product, returns all the rows in all the tables listed in the query. Each row in the first table is paired with all the rows in the second table. This happens when there is no relationship defined between the two tables. We do not require cross join in our general applications, so we should try to avoid it. A cross join happens when we fail to provide a filler for the join in the query.
mysql> select t1.company_name as 'Manufacturer', t2.model_name as 'Model' from company t1, model t2;
+--------------+-----------+
| Manufacturer | Model |
+--------------+-----------+
| Audi | A4 |
| GM | A4 |
| Ford | A4 |
| Toyota | A4 |
| Tata | A4 |
| BMW | A4 |
| Audi | A6 |
| GM | A6 |
.....
.....
.....
| Ford | Custom |
| Toyota | Custom |
| Tata | Custom |
| BMW | Custom |
+--------------+-----------+
Subscribe to:
Posts (Atom)