Working with URLs in Python & Django

First the basics — navigating within your own app. If you’ve got this in your urls.py file:

from django.conf.urls import url
from paths import views

urlpatterns = (
    url(r'^some/path/here$', views.path_view, name='path_view'),
)

It would work to write something like return redirect('some/path/here'), but don’t. If you change your mind later and decide you want to pluralize the path to 'some/paths/here', you’d have to find every place you’d referenced that path in your codebase and change it. Instead, write return redirect(reverse('path_view')) and change the path just once, in urls.py. (You’ll need the import from django.core.urlresolvers import reverse; from Django 1.10 onward the same function is also available from django.urls.)

If your urls include something like this:

urlpatterns = (
    url(r'(?P<object_id>[0-9]+)/here/$', views.path_view, name='path_view'),
)

You can pass the dynamic pieces (in this case, the id) into reverse as kwargs: return redirect(reverse('path_view', kwargs={'object_id': '1234'}))
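To see what that named group actually captures, here is a small sketch using only the standard re module. Django’s resolver does more than this, so treat it purely as an illustration of the object_id group; the sample path is made up.

```python
import re

# The pattern from the urlpatterns entry above.
pattern = re.compile(r'(?P<object_id>[0-9]+)/here/$')

# Django matches against the request path without its leading slash.
match = pattern.search('1234/here/')

# The named group is what arrives at the view as a keyword argument,
# and it is the same name you hand to reverse() through kwargs.
captured = match.groupdict()
print(captured)  # {'object_id': '1234'}

# A path without digits in that position does not match at all.
no_match = pattern.search('abc/here/')
print(no_match)  # None
```

The captured value is always a string, which is why passing '1234' or 1234 to reverse both produce the same URL.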

Let’s say we’ve got multiple apps, and some of them are nested within others. You might have something like this in your root urls.py:

from django.conf.urls import url, include

from root import views

urlpatterns = (
    url(r'^$', views.home, name='home'),
    url(r'^nested/', include('root.nested.urls', namespace='nested')),
)

And then within the urls.py file of your nested app, you might have:

from django.conf.urls import url

from nested import views

urlpatterns = (
    url(r'^path/here/$', views.path_view, name='path_view'),
)

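The URL for this view can be looked up by calling reverse('nested:path_view'): the namespace given to include(), a colon, then the URL name. You can nest namespaces as many levels deep as you need, adding one colon-separated segment per level.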

This is where things get interesting. You can break a URL into its pieces, or compose a new one out of disparate pieces, and reassemble them however you want. The relevant docs are those for Python’s urlparse module (urllib.parse in Python 3).

I use this most often in tests where my view expects the request url to have query parameters, but there are plenty of other uses for it too — you can build a single url from pieces from as many different sources as you like, including self-composed strings.

from django.core.urlresolvers import reverse
from urlparse import urlunparse

url = urlunparse((None, None, reverse('url_name'), None,
                  'queryparam={}'.format(some_variable), None))
url #=> u'/urlpath?queryparam=variablevalue'

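Note that urlunparse expects a tuple of 6 elements (scheme, netloc, path, params, query, fragment), so if you don’t want anything for one of the pieces, pass in None. Empty strings work just as well, and they are the safer choice if you ever move to Python 3’s urllib.parse, which can reject None mixed with strings.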
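Here is roughly the same thing as a sketch for Python 3, where these functions live in urllib.parse. The path and parameter value are hypothetical stand-ins for reverse('url_name') and your own test data, and I’ve let urlencode build the query string so the value gets escaped properly.

```python
from urllib.parse import urlencode, urlunparse

# Hypothetical stand-ins: in a Django test, `path` would come from
# reverse('url_name') and the value from your own test data.
path = '/urlpath'
params = {'queryparam': 'variable value'}

# urlencode escapes the value (the space becomes '+').
query = urlencode(params)

# Six components: scheme, netloc, path, params, query, fragment.
# Empty strings stand in for the pieces we don't need.
url = urlunparse(('', '', path, '', query, ''))
print(url)  # /urlpath?queryparam=variable+value
```

Using urlencode instead of str.format means a value containing spaces, ampersands or equals signs won’t corrupt the query string.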

Likewise, you may need to do something with specific pieces of a url, but not the whole thing. Enter URL parsing.

from urlparse import urlparse

parsed_url = urlparse('https://www.example.com/path?query=value&another=value#fragment')
#=> ParseResult(scheme='https', netloc='www.example.com', path='/path', params='', query='query=value&another=value', fragment='fragment')

With that, you can access each piece independently, like so:

parsed_url.scheme #=> 'https'
parsed_url.query  #=> 'query=value&another=value'

etc.
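If you need the individual query parameters rather than the raw query string, the same module can split those too. A minimal sketch, assuming Python 3’s urllib.parse (on Python 2, parse_qs is in the urlparse module):

```python
from urllib.parse import parse_qs, urlparse

parsed_url = urlparse(
    'https://www.example.com/path?query=value&another=value#fragment')

# parse_qs maps each key to a list of values, since a key may repeat.
params = parse_qs(parsed_url.query)
print(params)  # {'query': ['value'], 'another': ['value']}

# The result is a namedtuple, so it also unpacks by position.
scheme, netloc, path, _params, query, fragment = parsed_url
print(netloc, path, fragment)  # www.example.com /path fragment
```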

Sometimes you have a url, but you need to add additional query params to it from another source. While this could be done with string manipulation, it’s safer to use these built-in url libraries, as they handle the encoding and edge cases for you. Pulling in some pieces from above, you might do something like the following:

from django.http import QueryDict
from urlparse import urlparse, urlunparse

parsed_url = urlparse('https://www.example.com/path?1=a&2=b#fragment')
queries = QueryDict(parsed_url.query, mutable=True)
queries.update({'1': 'c', '3': 'd'})
queries                       #=> <QueryDict: {u'1': [u'a', 'c'], u'2': [u'b'], u'3': ['d']}>
queries = queries.urlencode() #=> u'1=a&1=c&2=b&3=d'
new_url = urlunparse((parsed_url.scheme, parsed_url.netloc, parsed_url.path, parsed_url.params, queries, parsed_url.fragment))
new_url                       #=> u'https://www.example.com/path?1=a&1=c&2=b&3=d#fragment'

A few things to note:

  • Unless you explicitly make your QueryDict mutable, you won’t be able to update it
  • If you want to replace queries with your newly formed QueryDict, you’ll need to call .urlencode() on it first to get a string back out
  • If you add a duplicate key to your QueryDict, it won’t overwrite the previous value — your new query string will have 2 keys by the same name with different values
  • The values of your QueryDict are in a list form, not a string. There are some external libraries (I’m looking at you, allauth) that do some funky things with params, including updating a regular dictionary with a QueryDict and then calling urlencode on it. This will encode the brackets of your list as well and give you something like '1=a&2=%5Bu%27b%27%5D'. This is probably not what you want.
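If you’d rather overwrite a duplicate key than append to it, or you want to see the list-encoding problem without Django in the picture, here is a sketch using only the standard library. It assumes Python 3’s urllib.parse, and replace_query_params is a helper name I made up for illustration, not something Django or the standard library provides.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def replace_query_params(url, new_params):
    """Return url with new_params merged in, overwriting duplicate keys."""
    parsed = urlparse(url)
    # dict() keeps only the last value seen for a repeated key.
    params = dict(parse_qsl(parsed.query))
    params.update(new_params)
    # _replace swaps one component and leaves the others untouched.
    return urlunparse(parsed._replace(query=urlencode(params)))


new_url = replace_query_params(
    'https://www.example.com/path?1=a&2=b#fragment', {'1': 'c', '3': 'd'})
print(new_url)  # https://www.example.com/path?1=c&2=b&3=d#fragment

# The list-value pitfall: without doseq, each list is stringified whole.
list_values = {'1': ['a'], '2': ['b']}
broken = urlencode(list_values)
fixed = urlencode(list_values, doseq=True)
print(broken)  # 1=%5B%27a%27%5D&2=%5B%27b%27%5D
print(fixed)   # 1=a&2=b
```

Passing doseq=True tells urlencode to emit one key=value pair per list item, which is usually what you wanted in the first place.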

Enjoy!