Tag: Lessons learned

Tips on reading the systemd journal logs

When looking at logs, you should almost always add the --utc flag or check the server's
time zone (timedatectl status). It should always be UTC (timedatectl set-timezone UTC).
Why? Because when you look at logs, you often compare them with other logs and events, e.g.
Sentry issues, Papertrail logs, other systems, other servers, ...

Let's talk about the most frequent journalctl commands.

Usually, you don't want to see all the logs. You want to see logs from a specific unit
(e.g., HAProxy, MySQL...). You also want to specify the time range. For this, we need to
add 2 flags: --unit (-u for short) and --since (-S for short).

Generally, I like to use full flag names because IMO it's a good practice to do so. Why?
You learn them faster, and you will remember them longer because they are words. This is
the reason why we invented domains.
You also often copy-paste those commands to your co-workers, to your documentation, etc.
And if a developer isn't familiar with those flags, they will need to check the docs.
If we use full flag names, this is often unnecessary because full flag names are more
descriptive.

$ journalctl --unit haproxy --since "2 days ago"
$ journalctl --unit haproxy --since "2 hours ago"
$ journalctl --unit haproxy --since "2018-06-26"
$ journalctl --unit haproxy --since "2018-06-26 23:00"
$ journalctl --unit haproxy --since "2018-06-26" --until "2018-06-27"

You can also select multiple units.

$ journalctl --unit haproxy --unit mysql
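
If you build these commands from a script, the same full flag names keep the code readable. Here is a minimal Python sketch that only assembles the argument list. The helper name build_journalctl_command is my own invention, and --no-pager is added on the assumption that the output will be read by a program rather than a person.

```python
import shlex


def build_journalctl_command(units, since=None, until=None, utc=True):
    """Build a journalctl argument list using full flag names."""
    command = ["journalctl", "--no-pager"]
    if utc:
        command.append("--utc")  # print timestamps in UTC
    for unit in units:
        command += ["--unit", unit]  # --unit may be repeated
    if since:
        command += ["--since", since]
    if until:
        command += ["--until", until]
    return command


command = build_journalctl_command(
    ["haproxy", "mysql"], since="2018-06-26", until="2018-06-27"
)
# shlex.join (Python 3.8+) quotes arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

On a machine with systemd you could pass the list to subprocess.run(command, capture_output=True, text=True) to actually read the logs.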

Sometimes you want to search through logs. You have 2 options. You can use the --grep
flag or you can pipe the journalctl output to grep.

I usually prefer to pipe the output for 2 reasons. First, I can do things like count
the results (grep --count ...). Second, the results are printed to your terminal, so you
can copy-paste them.
The downside of piping is that you lose some contextual information, e.g. the lines
that mark a system reboot.

$ journalctl --unit haproxy | grep "Segmentation fault"
$ journalctl --unit haproxy | grep --count "Segmentation fault"
$ journalctl --unit haproxy --grep "Segmentation fault"

If you want to immediately see the last, e.g., 1000 lines of the journal log, you can use
--pager-end (-e for short) to jump to the end and --lines 1000 (-n) to show you
the last 1000 lines.

$ journalctl --pager-end --lines 1000

Show messages from the current boot (add --dmesg to see only kernel messages):

$ journalctl --boot

To see only kernel messages, we can filter by identifier. You can also use --identifier
(-t for short) for other services (e.g., haproxy).

$ journalctl --identifier kernel
$ journalctl --identifier haproxy

When you are doing live monitoring, you will want to use the --follow (-f) flag:

$ journalctl --unit haproxy --follow


[NixCon 2019] Reading Nix expressions (notes)

These are my notes from the Reading Nix expressions talk at NixCon 2019.

The Nix package language has a syntax that could be described as "JSON meets functional language concepts". E.g. you have let-expressions, which I would call a functional concept (ref: https://en.wikipedia.org/wiki/Let_expression). I first saw them in Elm and Haskell.

An important thing to keep in mind is that Nix is a lazy language so some code might not get evaluated if it is not called. This means you can actually read the code from the "middle". For example, you can start reading this config https://github.com/jwiegley/nix-config/blob/master/default.nix in line 12 and then continue to the next one. Once you see a variable like e.g. ${version} you can "evaluate" it in your head (in this case version is defined in line 1).

The point is that some variables/functions can be defined but might not get called so Nix won't evaluate them. This is completely different from how e.g. Python works.
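
To make the contrast with Python concrete, here is a small sketch. Python evaluates every value eagerly while building a dict, so an unused but broken entry still blows up. Wrapping values in zero-argument lambdas only imitates Nix's laziness and is not how Nix is implemented. The names expensive, eager_config and lazy_config are made up for illustration.

```python
def expensive():
    raise RuntimeError("this should never be evaluated")


def eager_config():
    # Python evaluates every value while building the dict,
    # so this raises even if "unused" is never read.
    return {"version": "1.0", "unused": expensive()}


def lazy_config():
    # Zero-argument lambdas (thunks) delay evaluation until a value
    # is actually requested, which roughly imitates lazy evaluation.
    return {"version": lambda: "1.0", "unused": lambda: expensive()}


try:
    eager_config()
    eager_failed = False
except RuntimeError:
    eager_failed = True

config = lazy_config()
version = config["version"]()  # only this thunk is forced

print(eager_failed, version)
```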

Functions

Nix functions are just lambda expressions. Example:

a: b: a + b

Another representation is keyword arguments or a set pattern:

{ a, b }: a + b

This one is the most popular. You can call this function with a and b attributes. If you need additional arguments you can use an ellipsis (...):

{ a, b, c, ... }: a + b + c

This will work on any set that contains at least the three named attributes.

Ref: https://nixos.org/nix/manual/#ss-functions
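
If you come from Python, the three function forms above map roughly onto the following sketch. This is only an analogy (Python sets and Nix attribute sets are different things), and the function names are hypothetical.

```python
# Nix: a: b: a + b  (a function that returns another function)
def add_curried(a):
    return lambda b: a + b


# Nix: { a, b }: a + b  (exactly these attributes; extras are an error)
def add_set(*, a, b):
    return a + b


# Nix: { a, b, c, ... }: a + b + c  (extra attributes are accepted and ignored)
def add_open_set(*, a, b, c, **_ignored):
    return a + b + c


print(add_curried(1)(2))
print(add_set(**{"a": 1, "b": 2}))
print(add_open_set(**{"a": 1, "b": 2, "c": 3, "d": 4}))
```

Calling add_set with an extra key raises a TypeError, which mirrors how a Nix set pattern without the ellipsis rejects unexpected attributes.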

Builtins

The Nix expression evaluator has a bunch of functions and constants built in. For example:

  • toString e (Convert the expression e to a string)
  • import path (Load, parse and return the Nix expression in the file path)
  • throw s (Throw an error message s. This usually aborts Nix expression evaluation)
  • map f list (Apply the function f to each element in the list list)

There are tons of other builtins available: https://nixos.org/nix/manual/#ssec-builtins

Keywords

We have several reserved keywords in Nix:

Operators

Nix also provides a bunch of operators, such as arithmetic addition, subtraction, division, etc.

See this link for the full list of operators.


These are the very basics of the Nix language. You will need this knowledge if you want to start hacking on Nix config files. Or should I say, if you want to know what you are actually doing while hacking on Nix config files 🙂

Python3 strings

A bit of background on unicode and UTF-8:

Unicode has a different way of thinking about characters. In Unicode, the letter “A” is a platonic ideal. It’s just floating in “heaven”. Every platonic letter in every alphabet is assigned a magic number by the Unicode consortium which is written like this: U+0639 (in Python “\u0639”).

UTF-8 is a system of storing your string of unicode code points (those magic “U+number“) in memory using 8 bit bytes.
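
A quick way to see the difference between a code point and its UTF-8 bytes is to print both. This is a small sketch; the helper name describe is made up.

```python
def describe(char):
    """Return the code point label and the UTF-8 bytes of one character."""
    code_point = "U+{:04X}".format(ord(char))
    utf8_bytes = char.encode("utf-8")
    return code_point, utf8_bytes


for char in ["A", "\u0639", "£"]:
    code_point, utf8_bytes = describe(char)
    # ASCII characters need one byte, the other two need two bytes each.
    print(code_point, utf8_bytes, len(utf8_bytes))
```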

One of the common questions in Python 3 is when to use a bytestring and when to use a string. When you are manipulating a string (e.g. “reversed(my_string)”) you should always use a string object and never a bytestring. Why? Here is an example:

 

my_string = "I owe you £100"
my_bytestring = my_string.encode()

>>> print(''.join([c for c in reversed(my_string)]))
001£ uoy ewo I
>>> print(''.join([chr(c) for c in reversed(my_bytestring)]))
001£Â uoy ewo I

You should never rely on an implicit encoding. In Python 3 str.encode() itself defaults to UTF-8, but other APIs (for example open()) pick a platform-dependent default, which will “almost” always be UTF-8. In the cases where it isn't, you will spend a lot of time finding the bug. So ALWAYS specify which encoding to use (e.g. “.encode('utf-8')”). Here is what happens when the encodings don't match:

 

>>> print('I owe you £100'.encode('utf-8').decode('latin-1'))
I owe you Â£100
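
The same idea as a small script: a sketch showing that a matching codec round-trips cleanly, a mismatched one produces garbage without raising an error, and the damage can be reversed if you know which wrong codec was used.

```python
original = "I owe you £100"
encoded = original.encode("utf-8")

# Decoding with the same codec restores the text.
round_trip = encoded.decode("utf-8")

# Decoding with a different codec silently produces mojibake.
garbled = encoded.decode("latin-1")

# If you know which wrong codec was used, the damage can be undone.
repaired = garbled.encode("latin-1").decode("utf-8")

print(round_trip == original)
print(garbled == original)
print(repaired == original)
```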

Full article can be found here: link

Python PDB

Last month I attended the PyMunich conference, and in this blog post I want to highlight a talk by Philip Bauer (“Debug like a pro. How to become a better programmer through pdb-driven development”).

Basic commands for pdb that Philip highlighted:

  • l[ist] (list source code of current file)
  • n[ext] (continue execution until next line)
  • s[tep] (execute the current line, stop at the first possible occasion)
  • r[eturn] (continue execution until the current function returns)
  • c[ontinue] (continue execution, only stop when a breakpoint is encountered)
  • w[here] (show stack trace, recent frame at bottom)
  • u[p] (move up the stack)
  • d[own] (move down the stack)
  • b[reak] (set a new breakpoint. `tbreak` for temporary breakpoints)
  • a[rgs] (print the argument list of the current function)

`ipdb`/`pdbpp` also have long list method (`ll`) which displays the whole function you are in.
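
To try these commands, you need a place where the debugger stops. Here is a minimal sketch; the function average and its debug flag are made up so that the script also runs non-interactively.

```python
import pdb


def average(values, debug=False):
    total = sum(values)
    if debug:
        # Execution pauses here; try `ll`, `a`, `w`, `pp locals()`, then `c`.
        pdb.set_trace()
    return total / len(values)


# Runs straight through; call average([2, 4, 6], debug=True) in a terminal
# to drop into the pdb prompt instead.
print(average([2, 4, 6]))
```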

Other python debugging tricks you should know about are:

  • use ? for getting additional information about a lib/class/function/... (e.g. os?)
  • use ?? for displaying the source code of the lib/class/function you want to inspect (e.g. os.path.join??)
  • pp (pretty-print) is already in pdb so you should always use it
  • pp locals() will pretty-print local variables

One of the best tricks is the `help` function, which accepts an object and returns a generated help page for it. The !help(obj.__class__) command will generate a help page that contains all the methods (including class methods and static methods) with docstrings, the method resolution order, data descriptors, attributes, inherited data and other attributes, and much more.
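
Outside of the debugger you can get the same page as a string with pydoc, which is what help() uses under the hood. A small sketch with two made-up classes:

```python
import pydoc


class Animal:
    """Base class for the example."""

    def speak(self):
        """Return the sound this animal makes."""
        return "..."


class Dog(Animal):
    """A dog that overrides speak()."""

    def speak(self):
        """Return the sound a dog makes."""
        return "Woof"

    @classmethod
    def create(cls):
        """Alternative constructor."""
        return cls()


obj = Dog()
# render_doc returns the text that help() would otherwise page through;
# the plaintext renderer avoids terminal bold sequences.
help_text = pydoc.render_doc(obj.__class__, renderer=pydoc.plaintext)
print(help_text)
```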

Full article about PyMunich 2016 is located here: link.

Releasing Python Packages

When you need to fix a bug, add a feature or change existing functionality in a library that you are using, there are several ways to do this. Here I will show you how to do this properly:

  1. Fork repository.
  2. Clone into your local dir.
  3. Create new branch and add your code (do not forget to add tests!).
  4. Commit & push, create PR on your forked repo, merge.
  5. Create PR to the base project so this change can go upstream (and then the whole community benefits from it).
  6. Release your updated version of this lib.

Step 3 can be a bit tricky to do because often you need to test this new code in your existing project. My favourite way to test if your new code works is to install this lib (package) in "develop mode":

pip install -e /src/my_modified_lib

Step 6 is needed if you have continuous integration like TravisCI. For this you also need your own private PyPI server or an account on the official PyPI server. To create a release package that you can upload to a PyPI server, just type:

python setup.py sdist --formats=zip

When releasing a package, the MANIFEST.in file is very important: if some of your files are missing from the .zip package, the problem is most likely in MANIFEST.in.
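
One way to catch missing files before uploading is to look inside the generated archive. The sketch below uses a throwaway zip in a temporary directory as a stand-in for the real file in `dist`. The helper name missing_from_sdist and all the file names are hypothetical.

```python
import os
import tempfile
import zipfile


def missing_from_sdist(archive_path, expected_suffixes):
    """Return the expected path suffixes that no archive member ends with."""
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
    return [
        suffix
        for suffix in expected_suffixes
        if not any(name.endswith(suffix) for name in names)
    ]


# Demo with a made-up archive; point archive_path at your own
# dist/<package>.zip to check a real release.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = os.path.join(tmp, "example-1.0.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("example-1.0/setup.py", "")
        archive.writestr("example-1.0/example/__init__.py", "")
    missing = missing_from_sdist(archive_path, ["setup.py", "templates/page.mako"])

print(missing)
```

Anything reported as missing is a candidate for a new line in MANIFEST.in.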

Then just upload this package (located in the `dist` folder) to the PyPI server. The next thing you need to do is tell your build process where this new package is located. Unfortunately, this differs depending on which build tool you use (e.g. if you are using buildout, you need to add a link to this package to the "find-links" list).

Lessons learned from EuroPython 2016

This was my first EuroPython conference and I had high expectations because I heard a lot of good things about it. I must say that overall it didn’t let me down. I learned several new things and met a lot of new people. So let's dive straight into the most important lessons.

On Tuesday I attended the “Effective Python for High-Performance Parallel Computing” training session by Michael McKerns. This was by far my favorite training session and I learned a lot from it. Before Michael started with code examples and code analysis he emphasized two things:

  1. Do not assume what you hear/read/think. Time it and measure it.
  2. Stupid code is fast! Intelligent code is slow!

At this point I knew that the session was going to be amazing. He gave us a GitHub link (https://github.com/mmckerns/tuthpc) where all examples with profiler results were located. He stressed that we shouldn’t believe him and that we should test them ourselves (lesson #1).

I strongly suggest cloning his GitHub repo (https://github.com/mmckerns/tuthpc) and testing those examples yourself. Here are my quick notes (TL;DR):

  • always compile regular expressions
  • use local variables (true = True, local = GLOBAL)
  • if you know how many elements there will be in your list, create it with None elements and then fill it (L = [None] * N)
  • when inserting items at index 0 of a list, use append and then reverse (insert at 0 is O(n), append is O(1))
  • use built-in functions, use built-in functions, use built-in functions!!! (they are implemented in C)
  • when extending a list use .extend() and not +
  • searching in a set (hash map) is a lot faster than searching in a list (O(1) vs O(n))
  • constructing a set is much slower than constructing a list, so you usually don’t want to transform a list into a set and then search in it because it will be slower. But again, you should test it
  • += on a mutable object such as a list doesn’t create a new instance, so use it in loops
  • a list comprehension is faster than a generator. A for loop is faster than a generator and sometimes also than a list comprehension (you should test it!)
  • importing is expensive (e.g. numpy is 0.1 sec)
  • switching between python arrays and numpy arrays is very expensive
  • if you start writing intelligent and complex code you should stop and rethink whether there is a more stupid way of achieving your goal (see lesson #2)
  • optimize the code you want to run in parallel. This is more important than to just run it in parallel.
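
In the spirit of lesson #1, here is a small timeit sketch for two of the notes above (set vs list membership, and append plus reverse vs insert at index 0). These are my own micro-benchmarks, not the ones from the training repo, and the numbers will differ from machine to machine.

```python
import timeit

N = 10_000
items = list(range(N))
items_set = set(items)

# Membership test: a list is scanned element by element,
# a set uses a hash lookup.
list_time = timeit.timeit(lambda: (N - 1) in items, number=200)
set_time = timeit.timeit(lambda: (N - 1) in items_set, number=200)


def build_with_insert():
    result = []
    for i in range(2_000):
        result.insert(0, i)  # shifts every existing element each time
    return result


def build_with_append():
    result = []
    for i in range(2_000):
        result.append(i)  # amortized constant time
    result.reverse()  # one linear pass at the end
    return result


insert_time = timeit.timeit(build_with_insert, number=20)
append_time = timeit.timeit(build_with_append, number=20)

print("list membership: {:.6f}s, set membership: {:.6f}s".format(list_time, set_time))
print("insert(0, x): {:.6f}s, append + reverse: {:.6f}s".format(insert_time, append_time))
```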

Here is a full blog post that I have written for NiteoWeb.

Unit testing in Python with mock

Before we start writing tests, let's make sure that we understand why we want to write unit tests and what the concept of unit testing is. Here are a few reasons (from my experience) why it is a good idea to write tests:

  • In more complex projects you can't simulate (or it's very hard to simulate) the error that you have found in your error.log. So if you write a unit test that simulates that error, you can then fix your code, and when the test passes you know you have fixed the bug.
  • It is very hard to check if some functionality is working when it gets its input from some other service that isn't built yet (or you only know how it behaves). You can write a unit test that will simulate (mock) that service.
  • In the long run every test you write will save you time. FACT!

For the conceptual part I will use this quote:

As a developer, you care more that your library successfully called the system function for ejecting a CD as opposed to experiencing your CD tray open every time a test is run.

 

OK, let's look at some examples. First we will start with some really simple ones (that I have "stolen" from Toptal):

This is our code that we want to test:

# -*- coding: utf-8 -*-

import os

def rm(filename):
    os.remove(filename)

With mock lib we can easily test our code like this:

# -*- coding: utf-8 -*-

from mymodule import rm

import mock
import unittest

class RmTestCase(unittest.TestCase):
    
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os):
        rm("any path")
        # test that rm called os.remove with the right parameters
        mock_os.remove.assert_called_with("any path")

Note: we test only if os.remove is successfully called with correct arguments.
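
If you want to try this without creating a separate mymodule file, here is a single-file variant. It is a sketch that assumes Python 3, where mock ships in the standard library as unittest.mock, and it patches os.remove directly because rm lives in the same file.

```python
import os
import unittest
from unittest import mock


def rm(filename):
    os.remove(filename)


class RmTestCase(unittest.TestCase):

    @mock.patch("os.remove")
    def test_rm(self, mock_remove):
        rm("any path")
        # Nothing is deleted: the real os.remove was replaced by a mock,
        # so we only verify how it was called.
        mock_remove.assert_called_once_with("any path")
```

Save it as a test file and run it with python -m unittest.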

Let's see how we can test a Pyramid/Django view function:

@view_config(
    route_name='generate_deeplink',
    renderer='myProject:templates/generate_deeplink.mako',
    permission="view",
)
def generate_deeplink(request):
    if request.POST:
        try:
            booking = ExternalAPI(
                request.POST["reserv_number"].strip(),
                request.POST["email"].strip(),
            ).get()
            return dict(booking=booking)
        except AdministrationException as e:
            request.session.flash(
                'AdministrationException: {}'.format(e))

    return dict()

And the test for this piece of code can be something like this:

BOOKING_INFO_RS = """
<RS>
  <Administration>
    <Errors/>
  </Administration>
  <Responses>
    <ReservationRS status="PENDING" test="tist">
      <ReservNo>123456789</ReservNo>
    </ReservationRS>
  </Responses>
</RS>
"""

@mock.patch('myProject.views.ExternalAPI')
def test_generate_deeplink(self, external_API_rq):
    req = testing.DummyRequest(post={
        'reserv_number': '12345678 ',
        'email': 'john.doe@example.com',
    })
    external_API_rq().get.return_value = BOOKING_INFO_RS

    result = views.generate_deeplink(req)
    external_API_rq.assert_called_with(
        '12345678',
        'john.doe@example.com',
    )
    self.assertEqual(result['booking'], BOOKING_INFO_RS)

So before we call views.generate_deeplink(req) we create a dummy request, mock ExternalAPI and set the return value of ExternalAPI.get(). After we call our view we check that ExternalAPI was called with the correct params and that the result is what we expect.

 

Here is a slightly more complex example that also uses the very useful freezegun lib.

@view_config(route_name='send_deeplink', permission="view")
def send_deeplink(request):
    _ = request.localizer.translate
    reserv_number = request.matchdict['reserv_number']
    email = request.matchdict['email']
    booking = ExternalAPI(reserv_number, email).get()

    json_querystring = ExternalAPI_checking_something(
        email,
        reserv_number,
        lang=langs.get(request.localizer.locale_name),
        data1=booking["data1"],
        data2=booking["data2"],
        data3=booking["data3"],
    ).query()

    deeplink = request.route_url(
        "some_route",
        lang=request.localizer.locale_name,
        query=json_querystring,
        booking_number=reserv_number,
        email=email,
    )

    email_template = Template(
        filename=request.registry.get("BASE") +
        '/myProject/customer/templates/mail_deeplink.mako',
        input_encoding='utf-8',
        output_encoding='utf-8',
    )

    body = email_template.render(
        request=request,
        deeplink=deeplink,
        booking=booking,
        date=datetime.now().strftime("%d.%m.%Y"),
        _=_,
    ).decode('utf_8')

    author = request.registry.get("mail_default_sender")
    subject = _("SUBJECT")

    message = Message(
        subject=subject,
        sender=author,
        recipients=[email],
        html=body,
    )

    request.registry['mailer'].send_immediately(message, fail_silently=False)

    request.session.flash('Deeplink successfully sent.')
    return HTTPFound(location=request.route_url('generate_deeplink'))

And our unit test:

# -*- coding: utf-8 -*-
from freezegun import freeze_time
from pyramid import testing
from pyramid.httpexceptions import HTTPFound
from myProject import views

import mock
import os
import unittest


BOOKING_INFO_DICT = {
    'data1': {
        'id': '15',
        'time': '12:00',
    },
    'data2': {
        'id': '12',
        'time': '12:00',
    },
    'data3': '29',
}
QUERY_STRING = 'param_1%3Dvalue_1%2Bparam_2%26param_3%3Dvalue_3%26'
BASE_PATH = os.path.abspath(os.path.join(os.pardir, os.pardir))
DEEPLINK = 'this is mocked deeplink'


@freeze_time("2015-01-14 12:00:01")
class TestDeeplink(unittest.TestCase):

    def setUp(self):
        self.config = testing.setUp()

    def tearDown(self):
        testing.tearDown()

    @mock.patch('myProject.views.Message')
    @mock.patch('myProject.views.Template')
    @mock.patch('myProject.views.ExternalAPI_checking_something')
    @mock.patch('myProject.views.ExternalAPI')
    def test_generate_deeplink(
        self,
        # mocks are injected bottom-up: the decorator closest to the
        # function provides the first argument after `self`
        external_API_rq,
        external_API_2_rq,
        mock_Template,
        mock_Message,
    ):
        req = testing.DummyRequest()
        req.localizer.locale_name = 'en'
        req.registry = {
            'BASE': BASE_PATH,
            'mail_default_sender': 'admin@localhost.com',
            'mailer': mock.MagicMock()
        }
        req.registry['mailer'].send_immediately = mock.MagicMock()
        req.matchdict = {
            'email': 'john.doe@example.com',
            'reserv_number': '12345678',
        }
        external_API_2_rq().query.return_value = QUERY_STRING
        external_API_rq().get.return_value = BOOKING_INFO_DICT
        req.route_url = mock.MagicMock()
        req.route_url.return_value = DEEPLINK
        mock_Template().render.return_value = 'rendered html email'
        req.session.flash = mock.MagicMock()
        mock_Message.return_value = 'awesome email'

        result = views.send_deeplink(req)

        external_API_rq.assert_called_with(
            '12345678',
            'john.doe@example.com',
        )

        # because `request.route_url` is called 2 times we must check if it
        # was called in the right order and with the correct params
        req.route_url.assert_has_calls([
            mock.call(
                'some_route',
                lang='en',
                query=QUERY_STRING,
                booking_number='12345678',
                email='john.doe@example.com',
            ),
            mock.call('generate_deeplink'),
        ])

        # checking if `Template` was called with the correct params
        mock_Template.assert_called_with(
            filename=os.path.join(
                BASE_PATH,
                'myProject/customer/templates/mail_deeplink.mako',
            ),
            input_encoding='utf-8',
            output_encoding='utf-8',
        )

        # checking if `Template.render()` was called with the correct params
        mock_Template().render.assert_called_with(
            request=req,
            deeplink=DEEPLINK,
            booking=BOOKING_INFO_DICT,
            date='14.01.2015',  # time that we have set with `@freeze_time`
            _=req.localizer.translate,
        )

        # ...
        mock_Message.assert_called_with(
            subject='SUBJECT',
            sender='admin@localhost.com',
            recipients=['john.doe@example.com'],
            html='rendered html email',
        )

        # ...
        req.registry['mailer'].send_immediately.assert_called_with(
            'awesome email',
            fail_silently=False,
        )

        # ...
        req.session.flash.assert_called_with(
            'Deeplink successfully sent.')

        # checking if the return result is redirect - HTTPFound instance
        self.assertIsInstance(result, HTTPFound)

 

First we mock variables and functions and then we set return values for these functions (e.g. mock_Template().render.return_value = 'rendered html email'). After the stage is set we call our view (result = views.send_deeplink(req)) and then we start checking what was called (e.g. ...assert_called_with(...)).

One thing to remember is to always set up your mocks and their return values BEFORE you call your view, and to check what was called AFTER you call your view (result = views.send_deeplink(req)).
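
The same "arrange, call, assert" order in a tiny self-contained sketch, using only the standard library (unittest.mock, assuming Python 3). It also shows how stacked patch decorators map to arguments. The function describe_cwd is made up for illustration.

```python
import os
import unittest
from unittest import mock


def describe_cwd():
    """Hypothetical function under test: summarize the current directory."""
    path = os.getcwd()
    return "{} contains {} entries".format(path, len(os.listdir(path)))


class DescribeCwdTest(unittest.TestCase):

    @mock.patch("os.listdir")  # outermost decorator -> last mock argument
    @mock.patch("os.getcwd")   # innermost decorator -> first mock argument
    def test_describe_cwd(self, mock_getcwd, mock_listdir):
        # BEFORE the call: set the return values.
        mock_getcwd.return_value = "/tmp/example"
        mock_listdir.return_value = ["a", "b", "c"]

        result = describe_cwd()

        # AFTER the call: check what was called and what came back.
        mock_getcwd.assert_called_once_with()
        mock_listdir.assert_called_once_with("/tmp/example")
        self.assertEqual(result, "/tmp/example contains 3 entries")
```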

 

MongoDB not starting up

A few days ago I had a problem where MongoDB crashed and I couldn't start it up. I realized that this problem occurred because the disk was full. So I removed a few things with rm -rf and tried sudo service mongod start. But this wasn't successful. I googled the problem and quickly found the solution:

 

sudo rm /var/lib/mongodb/mongod.lock
sudo mongod --repair
sudo service mongodb start

After this mongodb started without any problems.
