restructure documentation
shimizukawa committed Jul 23, 2018
1 parent f978dc8 commit a6ff776
Showing 8 changed files with 308 additions and 430 deletions.
6 changes: 4 additions & 2 deletions AUTHORS.rst
@@ -2,6 +2,8 @@
AUTHORS
=======

* Takayuki Shimizukawa <[email protected]>

* Takayuki Shimizukawa <https://github.com/shimizukawa>
* Kosei Kitahara <https://github.com/Surgo>
* Evandro Myller <https://github.com/emyller>
* Maxime Vdb <https://github.com/m-vdb>

8 changes: 5 additions & 3 deletions CHANGES.rst
@@ -4,8 +4,10 @@ CHANGES
0.9 (Unreleased)
----------------

* Drop support for Django 1.8, 1.9 and 1.10

* #35: Drop support for Django 1.8, 1.9 and 1.10.
* #40: Support Django 2.0.
* #42: Support DISTKEY. Thanks to Benjy Weinberger.
* Documentation: http://django-redshift-backend.rtfd.io/

0.8.1 (2018-06-19)
------------------
@@ -17,7 +19,7 @@ CHANGES

Incompatible Changes:

* #23,#10 Redshift supports time zones in timestamps for migration
* #23,#10: Redshift supports time zones in timestamps for migration

**IMPORTANT**:
With this change, the newly created DateTimeField column will be timestamp
323 changes: 5 additions & 318 deletions README.rst
@@ -2,61 +2,10 @@
Redshift database backend for Django
====================================

This product is tested with:
This is a Redshift database backend for Django.

* python-2.7, 3.5, 3.6
* django-1.11, 2.0


Differences from postgres_psycopg2 backend
==========================================

Type mapping:

* 'integer identity(1, 1)' for AutoField
* 'bigint identity(1, 1)' for BigAutoField
* 'timestamp with time zone' for DateTimeField
* 'varchar(max)' for TextField
* 'varchar(32)' for UUIDField
* VARCHAR lengths can be multiplied to accommodate UTF-8 strings, using the
  `REDSHIFT_VARCHAR_LENGTH_MULTIPLIER` setting.

Stop using:

* RETURNING (single insert and bulk insert)
* SELECT FOR UPDATE
* SELECT DISTINCT ON
* SET CONSTRAINTS
* INDEX
* DEFERRABLE INITIALLY DEFERRED
* CONSTRAINT
* CHECK
* DROP DEFAULT

To support migration:

* To add a column to an existing table on Redshift, the column must be nullable
* To modify a column: add a new column -> migrate the data -> drop the old column -> rename the new column

Please note that migration support for Redshift is not complete yet.

Notes and Limitations
---------------------

Amazon Redshift doesn't support RETURNING, so the ``last_insert_id`` method retrieves MAX(pk) after insertion as a workaround.

refs:

* http://stackoverflow.com/q/19428860
* http://stackoverflow.com/q/25638539

In some cases, the MAX(pk) workaround does not work correctly.
Bulk insertion produces non-contiguous IDs such as 1, 4, 7, 10, ...,
and a single insertion after such a bulk insertion can receive an unexpected ID such as 2 (the smallest unused ID), which MAX(pk) does not return.
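
The failure mode can be reproduced with a toy model in plain Python. This only illustrates the arithmetic and is not the backend's code: ``last_insert_id_workaround``, ``existing_ids`` and ``new_id`` are hypothetical names, and the ID values are the ones quoted above.

```python
# Toy model of the MAX(pk) workaround; not the backend's implementation.

def last_insert_id_workaround(ids):
    """Mimic a `SELECT MAX(pk)` issued right after an INSERT."""
    return max(ids)


# IDs left behind by a bulk insert (non-contiguous, as described above).
existing_ids = [1, 4, 7, 10]

# A later single insert receives an unused lower value.
new_id = 2
existing_ids.append(new_id)

# The workaround reports the largest ID, which is not the new row's ID.
reported = last_insert_id_workaround(existing_ids)
print(reported, new_id)
```

Code that relies on the primary key of a freshly saved object should therefore treat the value with care after bulk inserts.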


SETTINGS
========
Django settings
===============

ENGINE for DATABASES is 'django_redshift_backend'. You can set the name in your settings.py as::

@@ -71,276 +20,14 @@ ENGINE for DATABASES is 'django_redshift_backend'. You can set the name in your
}
}
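
For orientation, a complete settings entry might look like the sketch below. Only the ENGINE value comes from this README; the database name, credentials and host are placeholders chosen for illustration, and 5439 is Redshift's default port.

```python
# settings.py (sketch). Every value except ENGINE is a placeholder.
DATABASES = {
    'default': {
        'ENGINE': 'django_redshift_backend',
        'NAME': 'mydatabase',        # placeholder database name
        'USER': 'myuser',            # placeholder user
        'PASSWORD': 'change-me',     # placeholder; load real secrets elsewhere
        'HOST': 'example-cluster.example.us-east-1.redshift.amazonaws.com',
        'PORT': '5439',              # Redshift's default port
    }
}
```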

REDSHIFT_VARCHAR_LENGTH_MULTIPLIER:
    Multiplies VARCHAR lengths to accommodate UTF-8 strings. Default is 1.
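
Redshift measures VARCHAR length in bytes rather than characters, so multi-byte UTF-8 text can overflow a column sized by character count. The sketch below shows the arithmetic in plain Python. ``required_multiplier`` is a hypothetical helper and is not part of this package; how the backend applies the setting internally is not shown here.

```python
# -*- coding: utf-8 -*-

def required_multiplier(text):
    """Return the smallest integer m such that len(text) * m is at least
    the UTF-8 byte length of text."""
    chars = len(text)
    if chars == 0:
        return 1
    byte_len = len(text.encode('utf-8'))
    return -(-byte_len // chars)  # ceiling division


sample = u'日本語'  # 3 characters, 3 bytes each in UTF-8
print(len(sample), len(sample.encode('utf-8')), required_multiplier(sample))
```

For text like the sample, a setting such as ``REDSHIFT_VARCHAR_LENGTH_MULTIPLIER = 3`` would be the matching choice; pure ASCII data needs only the default of 1.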

Using sortkey
---------------------------------

There is built-in support for this option for Django >= 1.9. To use `sortkey`, simply define an `ordering` on the model meta as follows::

    from django.db import models

    class MyModel(models.Model):
        ...

        class Meta:
            ordering = ['col2']

N.B.: there is no validation of this option; instead, we let Redshift validate it for you. Be sure to refer to the `documentation <http://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_examples.html>`_.
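
To make the mapping concrete, here is a rough sketch of how an `ordering` list could translate into a SORTKEY clause. It is an illustration in plain Python, not the backend's actual code: ``sortkey_clause`` is a hypothetical helper, and stripping a leading ``-`` (Django's descending marker) is an assumption based on SORTKEY taking bare column names.

```python
def sortkey_clause(ordering):
    """Build a Redshift SORTKEY clause from a Django-style ordering list.

    Hypothetical helper, for illustration only.
    """
    if not ordering:
        return ''
    # Django marks descending order with a leading '-'; drop it.
    columns = [name.lstrip('-') for name in ordering]
    quoted = ', '.join('"{}"'.format(col) for col in columns)
    return 'SORTKEY({})'.format(quoted)


print(sortkey_clause(['col2']))
```

As in the note above, nothing here checks that the columns exist or are valid sort keys; that is left to Redshift.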

Using distkey
---------------------------------

There is built-in support for this option for Django >= 1.11. To use `distkey`, define an index on the model
meta with the custom index type `django_redshift_backend.distkey.DistKey` with `fields` naming a single field::

    from django.db import models
    from django_redshift_backend.distkey import DistKey

    class MyModel(models.Model):
        ...

        class Meta:
            indexes = [DistKey(fields=['customer_id'])]

Redshift doesn't have conventional indexes, and we don't generate SQL for them. We merely use
`indexes` as a convenient place in the Meta to identify the `distkey`.
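
A Redshift table can have only one DISTKEY column, which is why `fields` names a single field. The sketch below illustrates that constraint in plain Python. ``distkey_clause`` is a hypothetical helper, and raising ``ValueError`` for anything other than one field is an assumption of this sketch, not documented behaviour of `DistKey`.

```python
def distkey_clause(fields):
    """Build a Redshift DISTKEY clause from a single-element field list.

    Hypothetical helper, for illustration only.
    """
    if len(fields) != 1:
        raise ValueError('a distkey must name exactly one column')
    return 'DISTKEY("{}")'.format(fields[0])


print(distkey_clause(['customer_id']))
```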

You will likely encounter the following complication:

Inlining Index Migrations
~~~~~~~~~~~~~~~~~~~~~~~~~
Django's `makemigrations` generates a migration file that first applies a `CreateModel` operation without the
`indexes` option, and then adds the index in a separate `AddIndex` operation.

However, Redshift requires that the `distkey` be specified at table creation. As a result, you may need to
manually edit your migration files to move the index creation into the initial `CreateModel`.

That is, to go from::

    operations = [
        ...
        migrations.CreateModel(
            name='FactTable',
            fields=[
                ('distkeycol', models.CharField()),
                ('measure1', models.IntegerField()),
                ('measure2', models.IntegerField())
                ...
            ]
        ),
        ...
        migrations.AddIndex(
            model_name='facttable',
            index=django_redshift_backend.distkey.DistKey(fields=['distkeycol'], name='...'),
        ),
    ]

To::

    operations = [
        ...
        migrations.CreateModel(
            name='FactTable',
            fields=[
                ('distkeycol', models.CharField()),
                ('measure1', models.IntegerField()),
                ('measure2', models.IntegerField())
                ...
            ],
            options={
                'indexes': [django_redshift_backend.distkey.DistKey(fields=['distkeycol'], name='...')],
            },
        ),
        ...
    ]


Inlining ForeignKey Migrations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is common to distribute fact tables on a foreign key column referencing the primary key of a dimension table.

In this case you may also encounter the following added complication:

Django's `makemigrations` generates a migration file that first applies a `CreateModel` operation without the
`ForeignKey` column, and then adds the `ForeignKey` column in a separate `AddField` operation. It does this to
avoid attempts to create foreign key constraints against tables that haven't been created yet.

However, Redshift requires that the `distkey` be specified at table creation. As a result, you may need to
manually edit your migration files to move the ForeignKey column into the initial `CreateModel`, while also
ensuring that the referenced table appears *before* the referencing table in the file.

That is, to go from::

    operations = [
        ...
        migrations.CreateModel(
            name='FactTable',
            fields=[
                ('measure1', models.IntegerField()),
                ('measure2', models.IntegerField())
                ...
            ]
        ),
        ...
        migrations.CreateModel(
            name='Dimension1Table',
            fields=[
                ...
            ]
        ),
        ...
        migrations.AddField(
            model_name='facttable',
            name='dim1',
            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='myapp.Dimension1Table'),
        ),
        ...
    ]

For more information, please refer to http://django-redshift-backend.rtfd.io/

To::

    operations = [
        migrations.CreateModel(
            name='Dimension1Table',
            fields=[
                ...
            ]
        ),
        ...
        migrations.CreateModel(
            name='FactTable',
            fields=[
                ('measure1', models.IntegerField()),
                ('measure2', models.IntegerField()),
                ('dim1', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='myapp.Dimension1Table'))
                ...
            ]
        ),
        ...
    ]
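
When several tables reference one another, it helps to work out a creation order before editing the migration by hand. The sketch below does this with a small depth-first sort in plain Python. ``creation_order`` and its ``references`` mapping are hypothetical and not part of Django or this package; the table names are the ones used in the example above.

```python
def creation_order(references):
    """Order tables so that every referenced table precedes its referrers.

    `references` maps a table name to the tables its ForeignKey columns
    point at. Raises ValueError on circular references.
    """
    order = []
    state = {}

    def visit(table):
        if state.get(table) == 'done':
            return
        if state.get(table) == 'visiting':
            raise ValueError('circular reference involving ' + table)
        state[table] = 'visiting'
        for target in references.get(table, ()):
            visit(target)
        state[table] = 'done'
        order.append(table)

    for table in references:
        visit(table)
    return order


print(creation_order({
    'FactTable': ['Dimension1Table'],
    'Dimension1Table': [],
}))
```

The resulting list gives the order in which the `CreateModel` operations should appear in the file.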



TESTING
=======

Testing this package requires:

* tox-1.8 or later
* virtualenv-15.0.1 or later
* pip-8.1.1 or later

LICENSE
=======
Apache Software License


CHANGES
=======

0.9 (Unreleased)
----------------

* #35: Drop support for Django 1.8, 1.9 and 1.10.
* #40: Support Django 2.0.
* #42: Support DISTKEY. Thanks to Benjy Weinberger.

0.8.1 (2018-06-19)
------------------

* #38: Fix 0.8 being incompatible with Python 2. Thanks to Benjy Weinberger.

0.8 (2018-06-01)
----------------

Incompatible Changes:

* #23,#10: Redshift supports time zones in timestamps for migration

  **IMPORTANT**:
  With this change, newly created DateTimeField columns will be timestamp
  with time zone (TIMESTAMPTZ) by migration. Therefore, existing
  DateTimeField columns and new DateTimeField columns will have different
  data types as Redshift schema column types.
  django-redshift-backend provides no migration feature for this.
  See also: https://github.com/shimizukawa/django-redshift-backend/pull/23

New Features:

* #20,#26: Support for sortkey. Thanks to Maxime Vdb and Kosei Kitahara.
* #24: Add UUIDField support. Thanks to Sindri Guðmundsson.
* #14: More compat with redshift: not use SELECT DISTINCT ON.

Bug Fixes:

* #15,#21: More compat with redshift: not use CHECK. Thanks to Vasil Vangelovski.
* #18: Fix error on migration with django-1.9 or later that raises AttributeError
of 'sql_create_table_unique'.
* #27: annotate() does not work on Django-1.9 and later. Thanks to Takayuki Hirai.


Documentation:

* Add documentation: http://django-redshift-backend.rtfd.io/


0.7 (2017-06-08)
----------------

* Drop Python-3.4
* Drop Django-1.7
* Support Python-3.6
* Support Django-1.11

0.6 (2016-12-15)
----------------

* Fix crash when using bulk insert.

0.5 (2016-10-05)
----------------

* Support Django-1.10
* #9: Add support for BigAutoField. Thanks to Maxime Vdb.
* Fix crash on sqlmigrate when a field is modified.

0.4 (2016-05-17)
----------------

* Support Python-3.4 and 3.5
* #7: Restore support for django-1.7. Version 0.3 doesn't support django-1.7.
* #4: More compat with redshift: not use SET CONSTRAINTS. Thanks to Maxime Vdb.
* #6: More compat with redshift: not use sequence reset query. Thanks to Maxime Vdb.
* #5: Add REDSHIFT_VARCHAR_LENGTH_MULTIPLIER settings. Thanks to Maxime Vdb.
* Support column type changing on migration.

0.3 (2016-05-14)
----------------

* #3: more compat with Redshift (AutoField, DateTimeField, Index). Thanks to Maxime Vdb.
* More compat with redshift: add TextField
* More compat with redshift: not use DEFERRABLE, CONSTRAINT, DROP DEFAULT
* More compat with redshift: support modify column


0.2.1 (2016-02-01)
------------------

* The "SET TIME_ZONE" warning is now emitted as a debug log for the 'django.db.backend' logger.

0.2 (2016-01-08)
----------------

* Disable "SET TIME_ZONE" SQL execution even if settings.TIME_ZONE is specified.

0.1.2 (2015-06-05)
------------------

* Support Django-1.8

0.1.1 (2015-03-27)
------------------

* Disable "SELECT FOR UPDATE" SQL execution.

.. CHANGES.rst will be concatenated here by setup.py

0.1 (2015-03-24)
----------------

* Support Django-1.7
* Support "INSERT INTO" SQL execution without "RETURNING" clause.