Attention

This library is used for a few apps in production, but it is still early in development. Like the idea of it? Please star us on GitHub and contribute via the issues board and roadmap.

Django GCP

django-gcp is a library of tools to help you deploy and use django on Google Cloud Platform. Helpers are provided for storage, events, tasks, logging, error reporting and Cloud Run metadata.

Aims

The ultimate goals are to:

  • Allow serverless django (for actual fully-fledged apps, not toybox tutorials).

  • Enable event-based integration between django and various GCP services.

  • Simplify the use of GCP resources in django including Storage, Logging, Error Reporting, Run, PubSub, Tasks and Scheduler.

Tip

For example, if we have both a Store and a PubSub subscription to events on that store, we can do smart things in django when files or their metadata change.

Background

To run a “reasonably comprehensive” django server on GCP, we have been using 4-5 libraries. Each covers a little bit of functionality, and we put in a lot of time to:

engage maintainers -> fork -> patch -> PR -> wait -> wait more -> release (maybe) -> update dependencies

Lots of the maintainers of those libraries have given up or are snowed under, which we have a lot of compassion for. Some, like django-storages, are (admirably) maintaining a uniform API across many compute providers, whereas we don’t change providers often enough to need that, so would rather have the flexibility to do platform-specific things.

We’ll be using GCP for the foreseeable future, so can accept a platform-specific API in order to use latest GCP features and best practices.

Contents

Getting Started

Tip

A complete example of a working server with django-gcp is provided in the tests folder of the source code.

Install the library

django-gcp is available on PyPI, so installation into your python virtual environment is dead simple:

poetry add django-gcp

Not using poetry? It’s highly opinionated, but it’s your friend. Google it.
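
If you'd rather stick with plain pip, installing straight from PyPI into your active environment works too:

pip install django-gcp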

Install the django app

Next, you’ll need to install this as an app in your django settings:

INSTALLED_APPS = [
    # ...
    'django_gcp',
    # ...
]

Add the endpoints

Tip

If you’re only using storage, and not events or tasks, you can skip this step.

Include the django-gcp URLs in your your_app/urls.py:

from django.urls import include, re_path
from django_gcp import urls as django_gcp_urls

urlpatterns = [
   # ...other routes
   # Use whatever regex you want:
   re_path(r"^django-gcp/", include(django_gcp_urls))
]

Using python manage.py show_urls you can now see the endpoints for both events and tasks appear in your app.

Authentication

There are two aspects to authentication with django-gcp: authenticating the server to interact with GCP, and authenticating incoming webhooks or messages from PubSub.

Authenticating the Server

Authenticating the server requires Service Account Credentials or Application Default Credentials.

Attention

At the time of writing, Google’s process for managing authentication in their SDKs is somewhat intractable, with difficult to navigate guidance and varying practices implemented and recommended across the platform. It is very easy to leak credentials as a result, so please take care.

However, there are some promising developments currently happening (like Workload Identity Federation and Service Account Impersonation) so we hope that soon it’ll be much easier to have a single workflow for this. In the meantime it’s worth following this guy.

A major issue in particular for storage is that we need the ability to sign files in GCS, which either requires a dedicated service account with the full key available (no longer recommended by google), or requires additional calls to a google-hosted API, significantly slowing any interaction requiring signed URLs.

If you’re not using media storage - only tasks/events and static (public) storage - this should not be an issue and you can use service account impersonation, federation or ADCs as appropriate.

Create a service account

In most cases, the default service accounts are not sufficient to read/write and sign files in GCS, so you will need to create a dedicated service account:

On GCP infrastructure
  • This library will attempt to read the credentials provided when running on google cloud infrastructure.

  • Ensure your service account is being used by the deployed GCR / GKE / GCE instance.

Warning

Default Google Compute Engine (GCE) Service accounts are unable to sign URLs.

On GitHub Actions

You may need to use the library on infrastructure external to Google, like GitHub Actions - for example running collectstatic within a GitHub Actions release flow.

You’ll want to avoid injecting a service account json file into your GitHub Actions if possible, so you should consider Workload Identity Federation, which is made pretty easy by these glorious GitHub Actions.

Locally

We’re working on using service account impersonation, but it’s not fully available for all the SDKs yet; there are still a lot of teething problems (like this one, solved 6 days ago at the time of writing).

So you should totally try that (please submit a PR here to show the process if you get it to work!!). In the meantime…

  • Create the key and download the your-project-XXXXX.json file.

Danger

It’s best not to store this in your project, to prevent accidentally committing it or building it into a docker image layer. Instead, bind mount it into docker images and devcontainers from somewhere else on your local system.

If you must keep it within your project, it’s good practice to name the file gha-creds-<whatever>.json and make sure that gha-creds-* is in your .gitignore and .dockerignore files.

  • If you’re developing in a container (like a VSCode .devcontainer), mount the file into the container. You can make gcloud available too - check out this tutorial.

  • Set an environment variable of GOOGLE_APPLICATION_CREDENTIALS to the path of the json file.
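
For example, in the shell where you run your development server (or in your devcontainer configuration):

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your-project-XXXXX.json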

Authenticating Webhooks and PubSub messages

Warning

We are yet to add the ability to accept JWT-authenticated push subscriptions from PubSub, EventArc, Cloud Tasks or Cloud Scheduler, so that authentication is handled out of the box.

In the meantime, it’s your responsibility to ensure that your handlers are protected (or otherwise wrap the urls in a decorator to manage authentication).

The best way of doing this is to generate a single use token and supply it as an event parameter (see Generating Endpoint URLs).

We want to work on this so if you’d like to sponsor that, find us on GitHub!

Events

This module provides a simple interface allowing django to absorb events, eg from Pub/Sub push subscriptions or EventArc.

Events are communicated using django’s signals framework. They can be handled by any app (not just django-gcp) simply by creating a signal receiver.

Warning

Please see Authenticating Webhooks and PubSub messages to learn about authenticating incoming messages.

Events Endpoints

If you have django_gcp installed correctly (see Add the endpoints), using python manage.py show_urls will show the endpoints for events.

Endpoints are POST-only and require two URL parameters, an event_kind and an event_reference. The body of the POST request forms the event_payload.

So, if you POST data to https://your-server.com/django-gcp/events/my-kind/my-reference/ then a signal will be dispatched with event_kind="my-kind" and event_reference="my-reference".
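
As a quick illustration, you could push an event to that endpoint from an external script like this (a sketch only; the token query parameter is a hypothetical example of the single-use token approach described above, and ends up in event_parameters):

import requests

# Illustrative only: POST a JSON payload to the django-gcp events endpoint.
# The body becomes event_payload; query parameters become event_parameters.
response = requests.post(
    "https://your-server.com/django-gcp/events/my-kind/my-reference/",
    params={"token": "a-single-use-token"},  # hypothetical protection, see the warning above
    json={"my": "payload"},
    timeout=10,
)
response.raise_for_status()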

Creating A Receiver

This is how you attach your handler. In your-app/signals.py file, do:

import logging
from django.dispatch import receiver
from django_gcp.events.signals import event_received
from django_gcp.events.utils import decode_pubsub_message


logger = logging.getLogger(__name__)


@receiver(event_received)
def receive_event(sender, event_kind, event_reference, event_payload, event_parameters):
    """Handle question updates received via pubsub
    :param event_kind (str): A kind/variety allowing you to determine the handler to use (eg "something-update"). Required.
    :param event_reference (str): A reference value provided by the client allowing events to be sorted/filtered. Required.
    :param event_payload (dict, array): The event payload to process, already decoded.
    :param event_parameters (dict): Extra parameters passed to the endpoint using URL query parameters
    :return: None
    """
    # There could be many different event types, from your own or other apps, and
    # django-gcp itself (when we get going with some more advanced features)
    # so make sure you only act on the specific kind(s) you want to handle
    if event_kind == "something-important":
        # Here is where you handle the event using whatever logic you want
        # CAREFUL: See the tip above about authentication (verifying the payload is not malicious)
        print("DO SOMETHING IMPORTANT WITH THE PAYLOAD:", event_payload)
        #
        # Your payload can be from any arbitrary source, and is in the form of decoded json.
        # However, if the source is Eventarc or Pub/Sub, the payload contains a formatted message
        # with base64 encoded data; we provide a utility to further decode this into something sensible:
        message = decode_pubsub_message(event_payload)
        print("DECODED PUBSUB MESSAGE:" message)

Tip

To handle a range of events, use a uniform prefix for all their kinds, eg:

if event_kind.startswith("my-"):
    my_handler(event_kind, event_reference, event_payload)

Generating Endpoint URLs

A utility is provided to help generate URLs for the events endpoint. This is similar to, but easier than, generating URLs with django’s built-in reverse() function.

It generates absolute URLs by default, because integration with external systems is the most common use case.

import logging
from django_gcp.events.utils import get_event_url

logger = logging.getLogger(__name__)

get_event_url(
    'the-kind',
    'the-reference',
    event_parameters={"a":"parameter"},  # These get encoded as a querystring, and are decoded back to a dict by the events endpoint. Keep it short!
    url_namespace="gcp-events",  # You only need to edit this if you define your own urlpatterns with a different namespace
)

Tip

By default, get_event_url generates an absolute URL, using the configured settings.BASE_URL. To specify a different base url, you can pass it explicitly:

relative_url = get_event_url(
    'the-kind',
    'the-reference',
    base_url=''
)

non_default_base_url = get_event_url(
    'the-kind',
    'the-reference',
    base_url='https://somewhere.else.com'
)

Generating and Consuming Pub/Sub Messages

When hooked up to GCP Pub/Sub or eventarc, the event payload is in the form of a Pub/Sub message.

These messages have a specific format (see https://cloud.google.com/pubsub/docs/reference/rest/v1/PubsubMessage).

To allow you to interact directly with Pub/Sub (i.e. publish messages to a topic), or for the purposes of testing your signals, django-gcp includes a make_pubsub_message utility that provides an easy and pythonic way of constructing a Pub/Sub message.

For example, to test the signal receiver above with a replica of a real pubsub message payload, you might do:

from datetime import datetime

from django.test import TestCase
from django.urls import reverse
from django_gcp.events.utils import make_pubsub_message

class YourTests(TestCase):
    def test_your_code_handles_a_payload_from_pubsub(self):
        payload = make_pubsub_message({"my": "data"}, publish_time=datetime.now())

        response = self.client.post(
            reverse("gcp-events", args=["the-event-kind", "the-event-reference"]),
            data=payload,
            content_type="application/json",
        )

        self.assertEqual(response.status_code, 201)

Exception Handling

Any exception that gets raised in the handlers will be hidden from the user to prevent disclosure of information that may lead to attack.

Instead, a BAD_REQUEST (400) status code is returned with a generic error message.

Note

We’ll work on adding a way of returning more useful information to the end user, which will probably be based on raising a ValidationError or similar, a bit like using DRF serialisers.

However, this is low priority right now so as always, if you need this feature, ping us on GitHub!

Cloud Run

Metadata

The container contract for Google Cloud Run specifies an internal server for metadata about the running service. This is useful for:

  • determining if your app is running on Cloud Run or somewhere else,

  • fetching values like the project_id which are required for structured logging, and

  • generating tokens that can be used to sign blobs without a private key.

To avoid the need to write requests to the internal server yet again, django_gcp provides a wrapper class for those queries, exposing the results as properties. See django_gcp.metadata.metadata.CloudRunMetadata.

from django_gcp.metadata import CloudRunMetadata

meta = CloudRunMetadata()

# On your local machine, `meta.is_cloud_run` will be False, and accessing these
# attributes will raise a NotOnCloudRunError
if meta.is_cloud_run:
    print(meta.project_id)
    print(meta.project_number)
    print(meta.region)
    print(meta.compute_instance_id)
    print(meta.email)
    print(meta.token)

Attention

The log handlers included here work great but we suspect some improvements could be made to the structure of the logs to give fuller / more easily filterable results, especially around trace/span and the contents of the httpRequest object. Pick up the issue here: PRs are welcome!

Logging

Tip

Quickly set up logging out of the box by dropping the LOGGING entry from the example test server into your settings.py.

Structured logs

On Google Cloud, if you use structured logging, your entries can be filtered and inspected much more powerfully than if you log in plain text.

Django has its own default logging configuration, and we need to do some tweaking to it to make sure we capture the information in a structured way. Notice particularly that the django and django.server modules have specific setups to record, for example, request-level information.

django-gcp provides django_gcp.logging.GoogleStructuredLogsHandler to add django-specific behaviour to the Google StructuredLogsHandler that is used under the hood.
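
As a rough sketch only (the LOGGING entry in the example test server is the reference configuration; everything here beyond the handler class path is an assumption):

# settings.py - minimal sketch; copy the LOGGING entry from the example test
# server for the full recommended configuration (including the django and
# django.server loggers mentioned above).
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "structured": {
            "class": "django_gcp.logging.GoogleStructuredLogsHandler",
        },
    },
    "root": {"handlers": ["structured"], "level": "INFO"},
}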

Error Reporting

This isn’t the same thing as structured logging.

If you use Google Cloud Error Reporting (as opposed to sentry or similar), django-gcp provides a handler enabling you to send errors/exceptions directly from django. Then you can configure Error Reporting as you wish (eg to track unresolved errors, email teams, connect issue trackers, etc).

django-gcp provides django_gcp.logging.GoogleErrorReportingHandler to do this. You need to set the GCP_ERROR_REPORTING_SERVICE_NAME value in your settings.py.
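
Building on the logging sketch above, wiring the handler in might look something like this (the handler name and level are illustrative):

# settings.py - sketch only
GCP_ERROR_REPORTING_SERVICE_NAME = "your-app-name"  # Shown against errors in the Error Reporting console

LOGGING["handlers"]["error_reporting"] = {
    "class": "django_gcp.logging.GoogleErrorReportingHandler",
    "level": "ERROR",
}
LOGGING["root"]["handlers"].append("error_reporting")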

Storage

This module provides helpers for working with Google Cloud Storage, including:

  1. A django Storage class allowing django’s FileField to use GCS as a storage backend. This incorporates the GCS-specific parts of django-storages.

  2. A BlobField with an associated widget to facilitate direct uploads and provide more powerful ways of working with GCS features including metadata and revisions.

Direct upload widget

The widget provides a better user experience for blankable and overwriting options.

Installation and Authentication

First, follow the instructions to install, authenticate and (if necessary) set your project.

Create bucket(s)

This library doesn’t create buckets for you: infrastructure operations should be kept separate and dealt with using tools built for the purpose, like terraform or Deployment Manager.

If you’re setting up for the first time and don’t want to get into that kind of infrastructure-as-code stuff, then manually create two buckets in your project:

  • One with object-level permissions for media files.

  • One with uniform, public permissions for static files.

Tip

Having two buckets like this means it’s easier to configure which files are public and which aren’t. Plus, you can serve your static files much more efficiently - publicly shared files are cached in Google’s Cloud CDN, so they’re lightning quick for users to download, and egress costs you almost nothing.

Tip

To make it easy and consistent to set up (and remember which is which!), we always use kebab case for our bucket names in the form:

<app>-<purpose>-<environment>-<media-or-static>

The buckets for a staging environment in one of our apps look like this:

[Image: buckets configuration]

Setup Media and Static Storage

The most common types of storage are for media and static files, using the storage backend. We derived a custom storage type for each, making it easier to name them.

In your settings.py file, do:

# Set the default storage (for media files)
DEFAULT_FILE_STORAGE = "django_gcp.storage.GoogleCloudMediaStorage"
GCP_STORAGE_MEDIA = {
    "bucket_name": "app-assets-environment-media" # Or whatever name you chose
}

# Set the static file storage
#   This allows `manage.py collectstatic` to automatically upload your static files
STATICFILES_STORAGE = "django_gcp.storage.GoogleCloudStaticStorage"
GCP_STORAGE_STATIC = {
  "bucket_name": "app-assets-environment-static" # or whatever name you chose
}

# Point the urls to the store locations
#   You could customise the base URLs later with your own cdn, eg https://static.you.com
#   But that's only if you feel like being ultra fancy
MEDIA_URL = f"https://storage.googleapis.com/{GCP_STORAGE_MEDIA_NAME}/"
MEDIA_ROOT = "/media/"
STATIC_URL = f"https://storage.googleapis.com/{GCP_STORAGE_STATIC_NAME}/"
STATIC_ROOT = "/static/"

Default and Extra stores

Any number of extra stores can be added, each corresponding to a different bucket in GCS.

You’ll need to give each one a “storage key” to identify it. In your settings.py, include extra stores as:

GCP_STORAGE_EXTRA_STORES = {
    "my_fun_store_key": {
        "bucket_name": "all-the-fun-datafiles"
    },
    "my_sad_store_key": {
        "bucket_name": "all-the-sad-datafiles"
    }
}

BlobField Storage

The benefit of a BlobField is that you can do direct upload of objects to the cloud.

This allows you to accept uploads of files larger than 32MB whilst on request-size-limited services like Cloud Run.

To enable this and other advanced features (like caching of metadata and blob version tracking), BlobFields intentionally don't maintain the FileField API. Under the hood, a BlobField is actually a JSONField, allowing properties other than just the blob name to be stored in the database.

We’ll flesh out these instructions later (pull requests accepted!) but in the meantime, see the example implementation here.

You’ll need to:

  1. Add a django_gcp.storage.fields.BlobField field to a model.

  2. Define a get_destination_path callback to generate the eventual name of the blob in the store (a rough sketch follows below).
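
A very rough sketch of those two steps is below. Treat the example implementation linked above as the source of truth: the callback signature and the keyword used to wire it up are assumptions here.

# models.py - illustrative sketch only; check the example server for the real
# BlobField arguments and the exact get_destination_path signature.
from django.db import models
from django_gcp.storage.fields import BlobField


def get_destination_path(instance, original_name, **kwargs):
    # Assumed signature: returns the eventual blob name in the store,
    # and can use other fields on the model instance.
    return f"reports/{instance.pk}/{original_name}"


class Report(models.Model):
    title = models.CharField(max_length=64)
    file = BlobField(get_destination_path=get_destination_path)  # kwarg name assumed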

Tip

On upload, blobs are always ingressed to a temporary location then moved to their eventual destination on save of the model. Two steps (ingress -> rename) seem unnecessary, but this allows the eventual destination to use the other model fields. It also avoids problems where you require deterministic object names: if object versioning or retention is enabled on your bucket, an unrelated failure in the model save() process could otherwise prevent future uploads to the same pathname.

Warning

Migrating from an existing FileField to a BlobField is possible but a bit tricky. We provide an example of how to do that migration in the example server model (see the instructions in the model, and the corresponding migration files).

FileField Storage

Works as a standard drop-in storage backend.

Standard file access options are available, and work as expected:

>>> default_storage.exists('storage_test')
False
>>> file = default_storage.open('storage_test', 'w')
>>> file.write('storage contents')
>>> file.close()

>>> default_storage.exists('storage_test')
True
>>> file = default_storage.open('storage_test', 'r')
>>> file.read()
'storage contents'
>>> file.close()

>>> default_storage.delete('storage_test')
>>> default_storage.exists('storage_test')
False

Storage Settings Options

Each store can be set up with different options, passed within the dict given to GCP_STORAGE_MEDIA, GCP_STORAGE_STATIC or within the dicts given to GCP_STORAGE_EXTRA_STORES.

For example, to set the media storage up so that files go to a different location than the root of the bucket, you’d use:

GCP_STORAGE_MEDIA = {
    "bucket_name": "app-assets-environment-media"
    "location": "not/the/bucket/root/",
    # ... and whatever other options you want
}

The full range of options (and their defaults, which apply to all stores) is as follows:

gzip

Type: boolean

Default: False

Whether or not to enable gzipping of the content types specified by gzip_content_types

gzip_content_types

Type: tuple

Default: ("text/css", "text/javascript", "application/javascript", "application/x-javascript", "image/svg+xml")

Content types which will be gzipped when gzip is True

default_acl

Type: string or None

Default: None

ACL used when creating a new blob, from the list of predefined ACLs. (A “JSON API” ACL is preferred but an “XML API/gsutil” ACL will be translated.)

For most cases, the blob will need to be set to the publicRead ACL in order for the file to be viewed. If the default_acl option is not set, the blob will have the default permissions set by the bucket.

publicRead files will return a public, non-expiring URL. All other files return a signed (expiring) URL.

ACL options are: projectPrivate, bucketOwnerRead, bucketOwnerFullControl, private, authenticatedRead, publicRead, publicReadWrite

Note

default_acl must be set to publicRead to return a public URL, even if you set the bucket to public or set the file permissions directly in GCS to public.

Note

When using this setting, make sure you have fine-grained access control enabled on your bucket, as opposed to Uniform access control, or else file uploads will fail with HTTP 400. If you already have a bucket with Uniform access control set to public read, keep default_acl as None and set querystring_auth to False.

querystring_auth

Type: boolean

Default: True

If set to False, it forces the URL not to be signed. This setting is useful if you have a bucket configured with Uniform access control and public read. In that case you should set querystring_auth to False and default_acl to None.
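
For example, a media store on a bucket with Uniform access control and public read might be configured like this:

GCP_STORAGE_MEDIA = {
    "bucket_name": "app-assets-environment-media",
    "default_acl": None,        # let the bucket's uniform policy apply
    "querystring_auth": False,  # return plain (unsigned, non-expiring) URLs
}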

file_overwrite

Type: boolean

Default: True

By default files with the same name will overwrite each other. Set this to False to have extra characters appended.

max_memory_size

Type: integer

Default: 0 (do not roll over)

The maximum amount of memory a returned file can take up (in bytes) before being rolled over into a temporary file on disk.

blob_chunk_size

Type: integer or None

Default: None

The size of blob chunks that are sent via resumable upload. If this is not set then the generated request must fit in memory. Recommended if you are going to be uploading large files.

Note

This must be a multiple of 256K (1024 * 256)

object_parameters

Type: dict

Default: {}

Dictionary of key-value pairs mapping from blob property name to value.

Use this to set parameters on all objects. To set these on a per-object basis, subclass the backend and override GoogleCloudStorage.get_object_parameters.

The valid property names are:

acl
cache_control
content_disposition
content_encoding
content_language
content_type
metadata
storage_class

If not set, the content_type property will be guessed.

If set, acl overrides default_acl.

Warning

Do not set name. This is set automatically based on the filename.
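
For per-object control, a minimal sketch of that subclassing approach (assuming the method receives the object name and returns the parameters dict, as in django-storages):

# storages.py - sketch only; the get_object_parameters signature is assumed.
from django_gcp.storage import GoogleCloudMediaStorage


class LongCacheMediaStorage(GoogleCloudMediaStorage):
    def get_object_parameters(self, name):
        params = super().get_object_parameters(name)
        if name.endswith(".svg"):
            # Cache SVGs aggressively; everything else keeps the defaults
            params["cache_control"] = "public, max-age=86400"
        return params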

custom_endpoint

Type: string or None

Default: None

Sets a custom endpoint that will be used instead of https://storage.googleapis.com when generating URLs for files.

location

Type: string

Default: ""

Subdirectory in which the files will be stored. Defaults to the root of the bucket.

expiration

Type: datetime.timedelta, datetime.datetime, or integer (seconds since epoch)

Default: timedelta(seconds=86400)

The time that a generated URL is valid before expiration. The default is 1 day. Public files will return a URL that does not expire. Files will be signed by the credentials provided during authentication.

The expiration value is handled by the underlying Google library. It supports timedelta, datetime, or integer seconds since epoch time.

Tasks

In django, tasks are used to handle processing work that happens outside of the main request-response cycle.

django-gcp allows tasks to be processed in a serverless environment like cloud run, triggered by managed services like Cloud Tasks, Cloud Scheduler or Pub/Sub topics.

About tasks in django

Tasks in django include, for example, dispatching jobs whose execution is too long to occur within a request (anything more than a few hundred milliseconds should probably be offloaded), running scheduled maintenance tasks (like refreshing a cache), or processing data that doesn’t need to be done within a request loop.

The classic example is sending email to a user responding to a registration request: a task requiring interaction with a third party API, making the request slow.

Existing solutions

Historically, to manage the queue of tasks, django has required the use of libraries like celery (which is very tricky to set up correctly) or django-dramatiq (a much cleaner API than celery, still a great option today) with an external message handler/store like Redis.

However, managing these queues requires the dev team to think about exactly-once delivery, retries and throttling. A Redis or RabbitMQ instance must be created and managed. To invoke tasks periodically, a cron job is required (meaning yet another moving part somewhere in the devops maze). Finally, these systems operate on a pull-based model, meaning that you constantly have to have workers alive, listening to the queue.

All that makes it difficult to run django in a serverless environment like Cloud Run. Plus, where tasks are only intermittent, it wastes a lot of money having workers up all the time.

Why django-gcp for tasks?

django-gcp uses a push-based model, meaning that workers can be serverless: autoscaled from zero in response to task requests.

It uses Google’s managed services, Cloud Tasks and Cloud Scheduler, enabling very quick and easy configuration of robust task queues and periodic triggers.

Creating and using tasks

TODO: I’ve written SO MUCH already and need to get this into production. This week.

I’ll come back and explain this, I promise.

~~ Tom ~~

IN THE MEANTIME:

Look at management commands available (both in django gcp and the example app), and look at the full example implementation here to pick up how to define and use tasks :)

If you need to use this library and can’t figure it out, get in touch by raising an issue on GitHub and we’ll help you configure your app (and write the docs at the same time).

Deduplicating tasks

OnDemandTask classes with the attribute deduplicate = True have the special property that the task cannot be repeated.

Deduplication is done using both the task name AND a short_sha of the payload data. That is:

  • You can enqueue the same task twice in succession with different payloads.

  • If you enqueue the same task with the same payload twice in quick succession, you will get a DuplicateTaskError.

  • A duplicate task will fail for ~1 hour after it is either executed or deleted from the queue.
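
A heavily hedged sketch of what that looks like in practice (the OnDemandTask class, the deduplicate attribute, enqueue() and DuplicateTaskError are all described in this documentation, but the import paths and the run method name here are assumptions; see the example implementation for the real task API):

# tasks.py - sketch only, with assumed import paths and method names
from django_gcp.tasks import OnDemandTask
from django_gcp.exceptions import DuplicateTaskError


class RebuildSearchIndexTask(OnDemandTask):
    deduplicate = True  # same task + same payload in quick succession raises DuplicateTaskError

    def run(self, **payload):
        ...  # do the actual work here


# Somewhere in a view or management command (illustrative):
#   RebuildSearchIndexTask().enqueue(index="products")  # queued
#   RebuildSearchIndexTask().enqueue(index="products")  # shortly after: DuplicateTaskError
#   RebuildSearchIndexTask().enqueue(index="orders")    # different payload: queued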

Tip

Deduplicating tasks introduces significant additional latency into the task queue. So don’t do it unless you have to!

Note

GCP requires a task ID to deduplicate tasks, whose string ordering should be optimally binomially distributed. django-gcp always prefixes the short_sha of the payload to ensure that the created task IDs are approximately binomially distributed (as opposed to using the task name as a prefix, which would give a highly non-optimal distribution in N clusters, where N is the number of differently named tasks).

Tasks Settings

There are a number of settings required to enable On-Demand and Scheduled tasks to work; we recommend you go through the following one by one…

In order of importance!

GCP_TASKS_DEFAULT_QUEUE_NAME

Type: string (required)

The name of the task queue on GCP used for on-demand tasks. This will be created (if not already present) when you enqueue your first task.

GCP_TASKS_DOMAIN

Type: string (required)

The base url of the server to which tasks will be pushed. In production, this’ll need to be set to the URL of your worker service (see Deploying Workers).

Tip

In local development, set up localtunnel and use its -s option to set yourself an amusing subdomain. You can then set GCP_TASKS_DOMAIN = "https://king-julian-in-da-house.loca.lt" in your local environment, and receive https:// traffic.

That’s awesome, because (assuming you’ve installed local credentials per Locally) it allows you to spin up actual real queues and schedules on GCP to get a feel for how this all works.
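
For example (assuming localtunnel is installed via npm and your dev server runs on port 8000):

npx localtunnel --port 8000 --subdomain king-julian-in-da-house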

GCP_TASKS_RESOURCE_AFFIX

Type: string

Default: None

This is a label which is affixed to the names of all resources created by django_gcp. It’s HIGHLY RECOMMENDED that you set this to avoid confusion about what resources belong to what applications, and to enable cleanup of old resources.

If left unset, there’ll be no affix applied. This might be exactly what you want: for example, if you manage all your task queues and scheduler jobs on existing infrastructure or using terraform, your own naming convention may already apply.

Warning

Without setting GCP_TASKS_RESOURCE_AFFIX, django-gcp won’t be able to clean up after itself so you’ll have to remove any resources manually yourself.

Make sure you don’t have multiple independent django apps with the same affix, or one app may delete resources for another.

Note

SubscriberTask subclasses that listen to a PubSub topic don’t automatically add a prefix to the topic name they listen to. This allows you to subscribe to any topic on GCP for triggering tasks; if you want to use the prefix you can do so when defining the topic_name override.

GCP_TASKS_REGION

Type: string

Default: “europe-west1”

The region in which resources (Task Queues, Scheduler Jobs, and PubSub topics) are accessed and/or created.

GCP_TASKS_DELIMITER

Type: string

Default: "--"

The delimiter used when creating resource names with an affix or other identifier.

GCP_TASKS_EAGER_EXECUTE

Type: boolean

Default: False

If set to true, tasks will synchronously execute when their enqueue() method is called (eg within a request).

Whilst not generally useful in production, this can be quite helpful for straightforward debugging of tasks in local environments.

GCP_TASKS_DISABLE_EXECUTE

Type: boolean

Default: False

If set to true, tasks will not be enqueued for processing when their enqueue() or enqueue_later() methods are called. Instead, the method will simply return None without enqueuing the task, allowing for the disabling of task execution. This can be useful in scenarios where tasks need to be temporarily disabled or when testing/debugging task code.

It is important to note that this setting only affects tasks when their enqueue() or enqueue_later() methods are called, and that tasks can still be executed manually even if this setting is set to True.
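
Putting the settings above together, a typical settings.py block might look something like this (all values are illustrative):

# settings.py - illustrative values only
GCP_TASKS_DEFAULT_QUEUE_NAME = "yourapp-queue"
GCP_TASKS_DOMAIN = "https://yourapp-worker-staging.example.com"  # your worker URL (see Deploying Workers)
GCP_TASKS_RESOURCE_AFFIX = "yourapp"
GCP_TASKS_REGION = "europe-west1"
GCP_TASKS_DELIMITER = "--"
GCP_TASKS_EAGER_EXECUTE = False
GCP_TASKS_DISABLE_EXECUTE = False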

Task Workers

A worker is a server instance, running the django application, whose sole job it is to do the tasks that get placed onto the queue (or pushed via scheduler or PubSub).

Deploying Workers

In the most straightforward usage, you don’t even need a separate worker. To get up and running minimally, you can point GCP_TASKS_DOMAIN straight back to the app itself! Check the docs of that setting for more tips on local development.

However, in most cases you’ll want the server to scale independently of the worker service and that’s not hard to achieve.

  • Begin using exactly the same configuration and deployment process that you use for the main server (eg deploy to Cloud Run, but use a -worker suffix in the app name).

  • Get the release-specific URL to that deployment.

  • Set that URL as the GCP_TASKS_DOMAIN value on the server.

Tip

Using Cloud Run, you can provide a tag to create a revision-specific URL as part of the worker deployment process. If you deploy worker and server at the same time, and configure the server with the revision-specific URL, the server will always send tasks to the same version of code that it’s running on itself. This is great for maintaining continuous uptime without worrying about breaking changes in the data required by your tasks.

On GitHub Actions, that looks something like:

# ... build an image, then ...

- name: Deploy to Cloud Run Worker
  id: deploy_worker
  uses: google-github-actions/deploy-cloudrun@v0
  with:
    service: yourapp-worker-${{ needs.build.outputs.environment }}
    image: ${{ needs.build.outputs.image_version_artefact }}
    region: europe-west1
    tag: ${{ needs.build.outputs.short_sha }}

- name: Deploy to Cloud Run Server
  id: deploy_server
  uses: google-github-actions/deploy-cloudrun@v0
  with:
    env_vars: |
      GCP_TASKS_DOMAIN=${{ steps.deploy_worker.outputs.url }}
    image: ${{ needs.build.outputs.image_version_artefact }}
    region: europe-west1
    service: yourapp-server-${{ needs.build.outputs.environment }}
    tag: ${{ needs.build.outputs.short_sha }}

Microservices as Workers

There’s nothing special or django-gcp specific about the data passed to tasks. So, there’s absolutely no reason why you shouldn’t use entirely separate microservices to receive and process tasks created by django-gcp!

Enjoy yourself, and let us know what you build :)

Projects

In most cases, the id of the GCP project you’re working on will be inferred from your Application Default Credentials or Service Account (see Authentication).

If that’s not correct (eg your service account has privileges across projects), you may need to set it explicitly.

Settings

GCP_PROJECT_ID (optional)

Your Google Cloud project ID. If unset, falls back to the default inferred from the environment.
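
For example, in your settings.py:

GCP_PROJECT_ID = "my-other-project"  # Only needed if the inferred project isn't the one you want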

API

Events

Logging

Metadata

Storage

Tasks

License

The Boring Bit

See the django-gcp license.

Third Party Libraries

django-gcp includes or is linked against code from third party libraries, see our attributions page.

Version History

We used to recommend that people maintain a version history here, but we now generate it automatically using our conventional commits tooling, which handles code versioning, release numbering and release history.

So for a full version history, check our releases page.

Thanks

This project is heavily based on a couple of really great libraries, particularly django-storages and django-cloud-tasks. See our attributions page.

Thank you so much to the (many) authors of these libraries :)

Also, this library boilerplate is from the django-rabid-armadillo project…

[Image: an unhappy armadillo]