Tag: Python

Conda cheat sheet

Create new environment:

conda create --name sandbox python=3.5

Create the environment from the environment.yml file:

conda env create -f environment.yml

Activate a specific environment (on conda 4.4 and later, use `conda activate sandbox` instead):

source activate sandbox

Export your active environment to a new file:

conda env export > environment.yml

Remove specific environment:

conda env remove --name sandbox


Python3 strings

A bit of background on unicode and UTF-8:

Unicode has a different way of thinking about characters. In Unicode, the letter "A" is a platonic ideal. It's just floating in "heaven". Every platonic letter in every alphabet is assigned a magic number by the Unicode Consortium, which is written like this: U+0639 (in Python "\u0639").

UTF-8 is a system for storing your string of Unicode code points (those magic "U+number" values) in memory using 8-bit bytes.
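
To make the distinction concrete, here is a minimal sketch (the variable names are only for illustration) that looks at the same character first as a code point and then as UTF-8 bytes:

```python
# One character, two views: the abstract code point and its UTF-8 bytes.
letter = "\u0639"                      # the code point U+0639

code_point = ord(letter)               # the "magic number" as an integer
utf8_bytes = letter.encode("utf-8")    # how UTF-8 stores it in memory

print(hex(code_point))                 # 0x639
print(utf8_bytes)                      # b'\xd8\xb9'
print(len(letter), len(utf8_bytes))    # 1 2
```

One code point can take more than one byte, which is why string operations should work on the string and not on its encoded form.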

One of the common questions in Python 3 is when to use a bytestring and when to use a string. When you are manipulating text (e.g. "reversed(my_string)"), always use a string object and never a bytestring. Why? Here is an example:


my_string = "I owe you £100"
my_bytestring = my_string.encode()

>>> print(''.join([c for c in reversed(my_string)]))
001£ uoy ewo I
>>> print(''.join([chr(c) for c in reversed(my_bytestring)]))
001£Â uoy ewo I

You should never rely on a default encoding. In Python 3, str.encode() itself defaults to UTF-8, but other APIs (for example open() in text mode) have historically used a locale-dependent default. That default is "almost" always UTF-8, but there are some instances where it is not, and then you will spend a lot of time finding the bug. So ALWAYS specify which encoding to use (e.g. ".encode('utf-8')"). Here is what happens when bytes are decoded with the wrong encoding:


>>> print('I owe you £100'.encode('utf-8').decode('latin-1'))
I owe you Â£100
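
A small sketch of the same advice, under the assumption that the text passes through a file (the temporary file and the variable names are made up for illustration): name the encoding on both the writing and the reading side, and the text should round-trip unchanged.

```python
import os
import tempfile

text = "I owe you £100"

# Explicit encoding on both sides of the str <-> bytes boundary.
encoded = text.encode("utf-8")
assert encoded.decode("utf-8") == text

# The same rule applies to files: name the encoding instead of relying on
# whatever default the platform happens to use.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_tripped = f.read()
finally:
    os.remove(path)

print(round_tripped == text)
```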

Full article can be found here: link

Python PDB

Last month I attended the PyMunich conference and in this blog post I want to highlight a talk by Philip Bauer ("Debug like a pro. How to become a better programmer through pdb-driven development").

Basic commands for pdb that Philip highlighted:

  • l[ist] (list source code of current file)
  • n[ext] (continue execution until next line)
  • s[tep] (execute the current line, stop at the first possible occasion)
  • r[eturn] (continue execution until the current function returns)
  • c[ontinue] (continue execution, only stop when a breakpoint is encountered)
  • w[here] (show stack trace, most recent frame at bottom)
  • u[p] (move up the stack)
  • d[own] (move down the stack)
  • b[reak] (set a new breakpoint. `tbreak` for temporary breakpoints)
  • a[rgs] (print the argument list of the current function)

`ipdb`/`pdbpp` also have a long list command (`ll`) which displays the whole function you are in.
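
To try these commands you first have to stop inside the debugger. A minimal sketch, assuming a made-up function `average` and a `debug` flag that exists only for this illustration: calling `pdb.set_trace()` drops you into the pdb prompt at that line, where the commands above are available.

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)
    if debug:
        # Execution pauses here; try `a`, `w`, `n` or `c` at the prompt.
        pdb.set_trace()
    return total / len(values)


# Runs without stopping; pass debug=True to enter the debugger.
result = average([1, 2, 3, 4])
print(result)
```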

Other python debugging tricks you should know about are:

  • use ? to get additional information about a lib/class/function/... (e.g. os?)
  • use ?? to display the source code of the lib/class/function you want to inspect (e.g. os.path.join??)
  • pp (pretty-print) is already in pdb so you should always use it
  • pp locals() will pretty-print local variables

One of the best tricks is the `help` function, which accepts an object and shows a generated help page for it. The !help(obj.__class__) command generates a help page that contains all the methods (including class methods and static methods) with their docstrings, the method resolution order, data descriptors, attributes, inherited data and other attributes, and much more.
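
Outside the debugger, the same kind of page can be captured as a string, which makes it easy to see what it contains. This is only a sketch: the classes `Base` and `Child` are made up, and `pydoc.render_doc` is used here as a stand-in for the interactive `help` call.

```python
import pydoc


class Base:
    """A made-up base class."""

    def greet(self):
        """Return a greeting."""
        return "hello"


class Child(Base):
    """A made-up subclass used only for this illustration."""

    @classmethod
    def create(cls):
        """Build a Child instance."""
        return cls()


obj = Child.create()

# pydoc.render_doc builds the kind of page that help() displays,
# but returns it as a string instead of paging it to the terminal.
page = pydoc.render_doc(obj.__class__, renderer=pydoc.plaintext)
print(page)
```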

Full article about PyMunich 2016 is located here: link.

Releasing Python Packages

When you need to fix a bug, add a feature, or change existing functionality in a library that you are using, there are several ways to do this. Here I will show you how to do it properly:

  1. Fork the repository.
  2. Clone it into a local directory.
  3. Create a new branch and add your code (do not forget to add tests!).
  4. Commit & push, create a PR on your forked repo, merge.
  5. Create a PR to the base project so this change can go upstream (and then the whole community benefits from it).
  6. Release your updated version of this lib.

Step 3 can be a bit tricky to do because often you need to test this new code in your existing project. My favourite way to test if your new code works is to install this lib (package) in "develop mode":

pip install -e /src/my_modified_lib

Step 6 is needed if you have continuous integration like TravisCI. For this you also need your own private PyPI server or an account on the official PyPI server. To create a release package that you can upload to a PyPI server, just type:

python setup.py sdist --formats=zip

When releasing a package, the MANIFEST.in file is very important: if your files are missing from the .zip package, the problem is most likely in MANIFEST.in.

Then just upload this package (located in the `dist` folder) to the PyPI server. The next thing you need to do is tell your build process where this new package is located. Unfortunately, this differs a lot depending on which build tool you use (e.g. if you are using buildout, you need to add a link to this package to the "find-links" list).

How to mock ‘open’ built-in function

Here is a code snippet showing how to test code that uses the built-in open function:

Code to test:

def my_function(self, id: int) -> None:
    file_path = 'path/to/files/{}.txt'.format(id)
    with open(file_path, 'w') as f:
        f.write('< My data >')


Unit test:

def test_my_function(self):
    _open = mock.mock_open()
    with mock.patch('my_project.scripts.my_script.open', _open, create=True):  # noqa
        # `self.script` is assumed to be an instance of the class under test
        self.script.my_function(123)
    _open.assert_called_with('path/to/files/123.txt', 'w')
    fp = _open.return_value.__enter__.return_value
    fp.write.assert_called_with('< My data >')
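
For experimenting, here is a self-contained variation that runs without a project around it. It is a sketch under a few assumptions: `save_report` is a made-up module-level function, the standard library `unittest.mock` is used instead of the `mock` package, and `builtins.open` is patched because there is no separate module to target.

```python
from unittest import mock


def save_report(report_id: int) -> None:
    """Made-up function under test: writes a fixed payload to a file."""
    file_path = 'path/to/files/{}.txt'.format(report_id)
    with open(file_path, 'w') as f:
        f.write('< My data >')


fake_open = mock.mock_open()
with mock.patch('builtins.open', fake_open):
    # No real file is touched while the patch is active.
    save_report(123)

# The file was "opened" once with the expected path and mode ...
fake_open.assert_called_once_with('path/to/files/123.txt', 'w')
# ... and the handle returned by open() received the expected data.
handle = fake_open.return_value
handle.write.assert_called_once_with('< My data >')
```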


Happy unit testing.


How to mock __import__ built-in function

So let's say we need to write a unit test for a piece of code that uses the __import__ built-in function:

def task_we_want_to_test(task_name: str, task_class: str, env: dict) -> Task:
    module = __import__(
        'my_project.scripts.{}'.format(task_name),  # module path assumed
        fromlist=[task_class, ])
    klass = getattr(module, task_class)
    return klass(env)


You can test this code like this:

def test_get_task(self):
    from my_project.scripts import task_we_want_to_test

    MyScript = mock.MagicMock()
    _import = mock.MagicMock()
    _import.MyScript = MyScript
    # registering the fake module in sys.modules makes __import__ return it
    with mock.patch.dict(
            'sys.modules',
            {'my_project.scripts.my_script': _import}):
        klass = task_we_want_to_test('my_script', 'MyScript', self.env)
        self.assertEqual(klass, MyScript(self.env))
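
The trick relies on the import system checking `sys.modules` before it looks anywhere else. A runnable sketch of just that mechanism, with made-up names (`load_task`, `hypothetical_tasks`, `MyTask`) and the standard library `unittest.mock`:

```python
import sys
from unittest import mock


def load_task(module_name, class_name, env):
    """Import a module by name and instantiate one of its classes."""
    module = __import__(module_name, fromlist=[class_name])
    klass = getattr(module, class_name)
    return klass(env)


fake_module = mock.MagicMock()
env = {'debug': True}

# While the patch is active, importing 'hypothetical_tasks' returns the mock
# because the import system finds the name in sys.modules first.
with mock.patch.dict(sys.modules, {'hypothetical_tasks': fake_module}):
    task = load_task('hypothetical_tasks', 'MyTask', env)

fake_module.MyTask.assert_called_once_with(env)
print(task is fake_module.MyTask.return_value)
```

After the `with` block ends, `mock.patch.dict` restores `sys.modules`, so the fake module does not leak into other tests.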


Lessons learned from EuroPython 2016

This was my first EuroPython conference and I had high expectations because I had heard a lot of good things about it. I must say that overall it didn't let me down. I learned several new things and met a lot of new people. So let's dive straight into the most important lessons.

On Tuesday I attended “Effective Python for High-Performance Parallel Computing” training session by Michael McKerns. This was by far my favorite training session and I have learned a lot from it. Before Michael started with code examples and code analysis he emphasized two things:

  1. Do not assume what you hear/read/think. Time it and measure it.
  2. Stupid code is fast! Intelligent code is slow!

At this point I knew that the session was going to be amazing. He gave us a GitHub link (https://github.com/mmckerns/tuthpc) where all the examples with profiler results are located. He stressed that we shouldn't believe him and that we should test them ourselves (lesson #1).

I strongly suggest cloning his GitHub repo (https://github.com/mmckerns/tuthpc) and testing those examples yourself. Here are my quick notes (TL;DR):

  • always compile regular expressions
  • use local variables (true = True, local = GLOBAL)
  • if you know how many elements your list will have, create it with None elements and then fill it (L = [None] * N)
  • when you need to insert items at index 0 of a list, append them and reverse the list at the end instead (each insert at index 0 is O(n), each append is O(1))
  • use built-in functions, use built-in functions, use built-in functions!!! (they are implemented in C)
  • when extending a list use .extend() and not +
  • searching in a set (hash table) is a lot faster than searching in a list (O(1) vs O(n))
  • constructing a set is much slower than constructing a list, so you usually don't want to transform a list into a set just to search in it, because it will be slower. But again, you should test it
  • += on a mutable object such as a list doesn't create a new instance, so use it in loops
  • list comprehension is better than generator. for loop is better than generator and sometimes also than list comprehension (you should test it!)
  • importing is expensive (e.g. numpy is 0.1 sec)
  • switching between Python arrays and numpy arrays is very expensive
  • if you start writing intelligent and complex code you should stop and rethink whether there is a more stupid way of achieving your goal (see lesson #2)
  • optimize the code you want to run in parallel. This is more important than just running it in parallel.
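
In the spirit of lesson #1, here is a small sketch for measuring two of these notes yourself with `timeit`. The function names and the sizes are arbitrary choices made for this illustration, and the numbers you get depend on your machine and Python version, so no results are claimed here.

```python
import timeit

N = 2000


def build_with_insert(n):
    """Put every new item at index 0 (each insert shifts the whole list)."""
    items = []
    for i in range(n):
        items.insert(0, i)
    return items


def build_with_append_reverse(n):
    """Append every item, then reverse once at the end."""
    items = []
    for i in range(n):
        items.append(i)
    items.reverse()
    return items


haystack_list = list(range(N))
haystack_set = set(haystack_list)
needle = N - 1  # worst case for the list: the last element

timings = {
    "insert(0, x)": timeit.timeit(lambda: build_with_insert(N), number=20),
    "append + reverse": timeit.timeit(
        lambda: build_with_append_reverse(N), number=20),
    "x in list": timeit.timeit(lambda: needle in haystack_list, number=2000),
    "x in set": timeit.timeit(lambda: needle in haystack_set, number=2000),
}

for label, seconds in timings.items():
    print("{:<18} {:.6f} s".format(label, seconds))
```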

Here is a full blog post that I have written for NiteoWeb.

More Python mocking: dictionary

In the previous post we looked at how basic mocking works and at some examples. Today we will see how you can mock a dict:

post = {
    "name": "test",
    "content": "tist",
}
data = mock.MagicMock()
data.__getitem__.side_effect = lambda key: post[key]

If we copy paste this into interpreter we can check the results:

>>> data["name"]
'test'
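
A slightly extended sketch of the same idea, using the standard library `unittest.mock`. Delegating `__contains__` as well is an addition made for this illustration: it lets `in` checks behave like the real dict too.

```python
from unittest import mock

post = {
    "name": "test",
    "content": "tist",
}

data = mock.MagicMock()
# Delegate item access and membership tests to the real dict.
data.__getitem__.side_effect = lambda key: post[key]
data.__contains__.side_effect = lambda key: key in post

print(data["name"])
print("content" in data)

# The mock still records how it was used, which a plain dict cannot do.
data.__getitem__.assert_called_with("name")
```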

Happy testing ...


Unit testing in Python with mock

Before we start writing tests, let's make sure we understand why we want to write unit tests and what the concept of unit testing is. Here are a few reasons (from my experience) why it is a good idea to write tests:

  • In more complex projects you can't simulate (or it's very hard to simulate) the error that you found in your error.log. If you write a unit test that simulates that error, you can then fix your code, and when the test passes you know you have fixed the bug.
  • It is very hard to check whether some functionality works when it gets its input from another service that isn't built yet (you only know how it should behave). You can write a unit test that simulates (mocks) that service.
  • In the long run, every test you write will save you time. FACT!

For the conceptual part I will use this quote:

As a developer, you care more that your library successfully called the system function for ejecting a CD as opposed to experiencing your CD tray open every time a test is run.


Ok, let's look at some examples. First we will start with some really simple ones (that I have "stolen" from Toptal):

This is our code that we want to test:

# -*- coding: utf-8 -*-

import os

def rm(filename):
    os.remove(filename)

With mock lib we can easily test our code like this:

# -*- coding: utf-8 -*-

from mymodule import rm

import mock
import unittest

class RmTestCase(unittest.TestCase):
    # replace the `os` name inside mymodule; the mock is passed in as mock_os
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os):
        rm("any path")
        # test that rm called os.remove with the right parameters
        mock_os.remove.assert_called_with("any path")

Note: we test only if os.remove is successfully called with correct arguments.
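
If you want to run this without creating `mymodule`, here is a single-file sketch. It makes two assumptions that differ from the snippet above: it uses the standard library `unittest.mock`, and it patches `os.remove` directly because the function and the test live in the same file. The second test is an extra that simulates an error which would be awkward to trigger for real.

```python
import os
import unittest
from unittest import mock


def rm(filename):
    os.remove(filename)


class RmTestCase(unittest.TestCase):

    @mock.patch("os.remove")
    def test_rm_calls_os_remove(self, mock_remove):
        rm("any path")
        mock_remove.assert_called_once_with("any path")

    @mock.patch("os.remove", side_effect=FileNotFoundError("no such file"))
    def test_rm_propagates_errors(self, mock_remove):
        # Simulate a failure that would be hard to trigger on demand.
        with self.assertRaises(FileNotFoundError):
            rm("missing file")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RmTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```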

Let's see how we can test a Pyramid/Django view function:

def generate_deeplink(request):
    if request.POST:
        try:
            booking = ExternalAPI(
                ...).get()  # constructor arguments are elided
            return dict(booking=booking)
        except AdministrationException as e:
            ...  # error handling is elided; it reports
            # 'AdministrationException: {}'.format(e)

    return dict()

And the test for this piece of code can be something like this:

    <ReservationRS status="PENDING" test="tist">

@mock.patch('myProject.views.ExternalAPI')  # patch target is an assumption
def test_generate_deeplink(self, external_API_rq):
    req = testing.DummyRequest(post={
        'reserv_number': '12345678 ',
        'email': 'john.doe@example.com',
    })
    external_API_rq().get.return_value = BOOKING_INFO_RS

    result = views.generate_deeplink(req)
    self.assertEqual(result['booking'], BOOKING_INFO_RS)

So before we call views.generate_deeplink(req), we create a dummy request, mock ExternalAPI, and set the return value of ExternalAPI.get(). After we call our view, we check that ExternalAPI was called with the correct params and that the result is what we expect.
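
The same flow can be tried without a web framework. Everything in the following sketch is a stand-in made up for illustration: the `ExternalAPI` class, the simplified view, the `BOOKING_INFO` response, and the use of `types.SimpleNamespace` in place of a real request object.

```python
import types
from unittest import mock


class ExternalAPI:
    """Hypothetical client; the real one would talk to a remote service."""

    def __init__(self, reserv_number, email):
        self.reserv_number = reserv_number
        self.email = email

    def get(self):
        raise RuntimeError("no network access in unit tests")


def generate_deeplink(request):
    """Simplified stand-in for the view."""
    if request.POST:
        booking = ExternalAPI(
            request.POST["reserv_number"], request.POST["email"]).get()
        return dict(booking=booking)
    return dict()


BOOKING_INFO = {"status": "PENDING"}  # made-up response

# Stand-in for a framework request object.
req = types.SimpleNamespace(POST={
    "reserv_number": "12345678",
    "email": "john.doe@example.com",
})

# 1. set up the mock and its return value, 2. call the view, 3. assert.
with mock.patch.object(
        ExternalAPI, "get", return_value=BOOKING_INFO) as fake_get:
    result = generate_deeplink(req)

fake_get.assert_called_once_with()
print(result)
```

Patching the `get` method on the class keeps the example independent of module paths; in a real project you would more likely patch the name where the view looks it up.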


Here is a slightly more complex example that also uses the very useful freezegun lib.

@view_config(route_name='send_deeplink', permission="view")
def send_deeplink(request):
    _ = request.localizer.translate
    reserv_number = request.matchdict['reserv_number']
    email = request.matchdict['email']
    booking = ExternalAPI(reserv_number, email).get()

    json_querystring = ExternalAPI_checking_something(

    deeplink = request.route_url(

    email_template = Template(
        filename=request.registry.get("BASE") +

    body = email_template.render(

    author = request.registry.get("mail_default_sender")
    subject = _("SUBJECT")

    message = Message(

    request.registry['mailer'].send_immediately(message, fail_silently=False)

    request.session.flash('Deeplink successfully sent.')
    return HTTPFound(location=request.route_url('generate_deeplink'))

And our unit test:

# -*- coding: utf-8 -*-
from freezegun import freeze_time
from pyramid import testing
from pyramid.httpexceptions import HTTPFound
from myProject import views

import mock
import os
import unittest

BOOKING_INFO_DICT = {
    'data1': {
        'id': '15',
        'time': '12:00',
    },
    'data2': {
        'id': '12',
        'time': '12:00',
    },
    'data3': '29',
}
QUERY_STRING = 'param_1%3Dvalue_1%2Bparam_2%26param_3%3Dvalue_3%26'
BASE_PATH = os.path.abspath(os.path.join(os.pardir, os.pardir))
DEEPLINK = 'this is mocked deeplink'

@freeze_time("2015-01-14 12:00:01")
class TestDeeplink(unittest.TestCase):

    def setUp(self):
        self.config = testing.setUp()

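    def tearDown(self):
        testing.tearDown()

    # the mock.patch decorators that inject these mocks are not shown;
    # the argument order here is an assumption
    def test_generate_deeplink(
            self, mock_Message, mock_Template,
            external_API_2_rq, external_API_rq):
        req = testing.DummyRequest()
        req.localizer.locale_name = 'en'
        req.registry = {
            'BASE': BASE_PATH,
            'mail_default_sender': 'admin@localhost.com',
            'mailer': mock.MagicMock()
        }
        req.registry['mailer'].send_immediately = mock.MagicMock()
        req.matchdict = {
            'email': 'john.doe@example.com',
            'reserv_number': '12345678',
        }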
        external_API_2_rq().query.return_value = QUERY_STRING
        external_API_rq().get.return_value = BOOKING_INFO_DICT
        req.route_url = mock.MagicMock()
        req.route_url.return_value = DEEPLINK
        mock_Template().render.return_value = 'rendered html email'
        req.session.flash = mock.MagicMock()
        mock_Message.return_value = 'awesome email'

        result = views.send_deeplink(req)


        # because `request.route_url` is called 2 times we must check if it
        # was called in the right order and with the correct params

        # checking if `Template` was called with the correct params

        # checking if `Template.render()` was called with the correct params
            date='14.01.2015',  # time that we have set with `@freeze_time`

        # ...
            html='rendered html email',

        # ...
            'awesome email',

        # ...
            'Deeplink successfully sent.')

        # checking if the return result is redirect - HTTPFound instance
        self.assertIsInstance(result, HTTPFound)


First we mock variables and functions and then we set return values for these functions (e.g. mock_Template().render.return_value = 'rendered html email'). After the stage is set we call our view (result = views.send_deeplink(req)) and then we start checking what was called (e.g. ...assert_called_with(...)).

One thing to remember is to always set your mock functions and return values BEFORE you call your view and check what was called AFTER you call your view ( result = views.send_deeplink(req) ).
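
A compact sketch of that ordering, including the case where the same mock is called twice and the order matters. The function `build_links` and the route name 'deeplink' are made up for this illustration, and the standard library `unittest.mock` is assumed.

```python
from unittest import mock


def build_links(request):
    """Hypothetical code under test: asks for two URLs, in this order."""
    deeplink = request.route_url("deeplink", _query={"id": "15"})
    redirect = request.route_url("generate_deeplink")
    return deeplink, redirect


# BEFORE the call: create the mock and set its return values.
req = mock.MagicMock()
req.route_url.side_effect = ["mocked deeplink", "mocked redirect"]

# The call under test.
links = build_links(req)

# AFTER the call: check what was called, how often, and in which order.
assert req.route_url.call_count == 2
req.route_url.assert_has_calls([
    mock.call("deeplink", _query={"id": "15"}),
    mock.call("generate_deeplink"),
])
print(links)
```

Passing a list as `side_effect` makes the mock return the items one by one, which is handy when a function is called several times and each call needs a different result.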