Lessons learned from EuroPython 2016

This was my first EuroPython conference and I had high expectations because I had heard a lot of good things about it. I must say that overall it didn’t let me down. I learned several new things and met a lot of new people. So let’s dive straight into the most important lessons.

On Tuesday I attended the “Effective Python for High-Performance Parallel Computing” training session by Michael McKerns. This was by far my favorite training session and I learned a lot from it. Before Michael started with code examples and code analysis he emphasized two things:

  1. Do not assume what you hear/read/think. Time it and measure it.
  2. Stupid code is fast! Intelligent code is slow!

At this point I knew that the session was going to be amazing. He gave us a GitHub link (https://github.com/mmckerns/tuthpc) where all the examples with profiler results are located. He stressed that we shouldn’t believe him and that we should test them ourselves (lesson #1).

I strongly suggest cloning his GitHub repo and testing those examples yourself. Here are my quick notes (TL;DR):

  • always compile regular expressions
  • use local variables (true = True, local = GLOBAL)
  • if you know how many elements your list will have, create it with None elements and then fill it (L = [None] * N)
  • when inserting items at index 0 of a list, use append and then reverse (inserting at index 0 is O(n), append is O(1))
  • use built-in functions, use built-in functions, use built-in functions!!! (they are implemented in C)
  • when extending a list use .extend() and not +
  • searching in a set (hash table) is a lot faster than searching in a list (O(1) vs O(n))
  • constructing a set is much slower than constructing a list, so you usually don’t want to transform a list into a set just to search in it once, because it will be slower. But again, you should test it
  • += on a mutable object such as a list modifies it in place instead of creating a new object, so use it in loops
  • a list comprehension is faster than a generator. A for loop is faster than a generator and sometimes also faster than a list comprehension (you should test it!)
  • importing is expensive (e.g. importing numpy takes about 0.1 sec)
  • switching between Python arrays and numpy arrays is very expensive
  • if you start writing intelligent and complex code you should stop and rethink whether there is a more stupid way of achieving your goal (see lesson #2)
  • optimize the code you want to run in parallel. This is more important than just running it in parallel.
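
In the spirit of lesson #1, a few of these notes can be checked with the standard-library timeit module. This is a minimal sketch of my own and not code from the tutorial repo; the sizes, the repeat count and the helper names (build_with_append, build_preallocated, time_it) are arbitrary assumptions, and the numbers will differ between machines and Python versions.

```python
import timeit

N = 10000
DATA_LIST = list(range(N))
DATA_SET = set(DATA_LIST)


def build_with_append(n):
    # grow the list one element at a time
    result = []
    for i in range(n):
        result.append(i)
    return result


def build_preallocated(n):
    # create the list at its final size, then fill it
    result = [None] * n
    for i in range(n):
        result[i] = i
    return result


def time_it(label, func, number=100):
    # run `func` `number` times and print the total time in seconds
    seconds = timeit.timeit(func, number=number)
    print("{:<30} {:.4f} s".format(label, seconds))
    return seconds


if __name__ == "__main__":
    # membership test: a list is scanned element by element, a set is hashed
    time_it("search in list", lambda: (N - 1) in DATA_LIST)
    time_it("search in set", lambda: (N - 1) in DATA_SET)
    # converting first and then searching also pays for building the set
    time_it("convert to set, then search", lambda: (N - 1) in set(DATA_LIST))
    time_it("build with append", lambda: build_with_append(N))
    time_it("build preallocated", lambda: build_preallocated(N))
```

Run it a few times before drawing conclusions; single measurements can be noisy.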

Here is a full blog post that I have written for NiteoWeb.

PostgreSQL quick examples

My cheat sheet for PostgreSQL. Also, pgcli is a really nice tool to use. IMO with it you don't need any "phpMyAdmin"-style tools.

Update query:

UPDATE blogs
SET posts=0,
pages=0,
plugins=0,
health_score=0
WHERE server_id IN (185,113,1,3,4);

Select query:

SELECT posts, pages, plugins, health_score
FROM blogs
WHERE server_id IN (185,113,1,3,4);

Check if an entity appears multiple times in a table:

SELECT blog_id, COUNT(*) 
FROM cleanup_notifications 
GROUP BY blog_id 
HAVING COUNT(*) > 1;
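
To see what this duplicate check returns without a PostgreSQL server at hand, here is a small sketch that runs the same GROUP BY / HAVING query against an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is swapped in purely for convenience, the rows are made up, and the ORDER BY is my addition to keep the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cleanup_notifications (id INTEGER PRIMARY KEY, blog_id INTEGER)"
)
# made-up rows: blog 1 appears once, blog 2 twice, blog 3 three times
conn.executemany(
    "INSERT INTO cleanup_notifications (blog_id) VALUES (?)",
    [(1,), (2,), (2,), (3,), (3,), (3,)],
)

duplicates = conn.execute(
    """
    SELECT blog_id, COUNT(*)
    FROM cleanup_notifications
    GROUP BY blog_id
    HAVING COUNT(*) > 1
    ORDER BY blog_id
    """
).fetchall()
conn.close()

print(duplicates)  # [(2, 2), (3, 3)]
```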

Get datatypes for a table:

SELECT column_name, data_type 
FROM information_schema.columns 
WHERE table_name = '<your_table>';

or run \d <table_name> in the psql / pgcli command line

Set public_url to 'http://...' + the secure token of the matching camera for all monitors that belong to the user with ID = 9:

UPDATE monitors as m 
SET public_url = 'http://...' || (SELECT secure_token FROM cameras as d WHERE m.domain_id=d.id)
WHERE m.user_id = 9;

 

More Python mocking: dictionary

In the previous post we looked at how basic mocking works, along with some examples. Today we will see how you can mock a dict:

post = {
    "name": "test",
    "content": "tist",
}
data = mock.MagicMock()
data.__getitem__.side_effect = lambda key: post[key]

If we copy and paste this into the interpreter we can check the results:

>>> data["name"]
'test'
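
Here is the same idea as a self-contained script. It assumes Python 3, where the mock library ships as unittest.mock (the snippet above uses the standalone mock package). Wiring up __contains__ is an extra of mine, to show that other dict behavior can be mocked the same way.

```python
from unittest import mock

post = {
    "name": "test",
    "content": "tist",
}

data = mock.MagicMock()
# item access on the mock is delegated to the real dict
data.__getitem__.side_effect = lambda key: post[key]
# the `in` operator is delegated as well
data.__contains__.side_effect = lambda key: key in post

print(data["name"])       # test
print("content" in data)  # True

# the mock still records how it was used
data.__getitem__.assert_any_call("name")
```

A missing key raises KeyError, just like a real dict, because the lambda looks the key up in post.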

Happy testing ...

 

Unit testing in Python with mock

Before we start writing tests, let's make sure that we understand why we want to write unit tests and what the concept of unit testing is. Here are a few reasons (from my experience) why it is a good idea to write tests:

  • In more complex projects you can't simulate (or it's very hard to simulate) an error that you have found in your error.log. If you write a unit test that simulates that error, you can then fix your code, and when the test passes you know you have fixed the bug.
  • It is very hard to check whether some functionality works when it gets its input from another service that isn't built yet (or you only know how it behaves). You can write a unit test that simulates (mocks) that service.
  • In the long run every test you write will save you time. FACT!

For the conceptual part I will use this quote:

As a developer, you care more that your library successfully called the system function for ejecting a CD as opposed to experiencing your CD tray open every time a test is run.

 

OK, let's look at some examples. First we will start with some really simple ones (that I have "stolen" from toptal):

This is our code that we want to test:

# -*- coding: utf-8 -*-

import os

def rm(filename):
    os.remove(filename)

With mock lib we can easily test our code like this:

# -*- coding: utf-8 -*-

from mymodule import rm

import mock
import unittest

class RmTestCase(unittest.TestCase):
    
    @mock.patch('mymodule.os')
    def test_rm(self, mock_os):
        rm("any path")
        # test that rm called os.remove with the right parameters
        mock_os.remove.assert_called_with("any path")

Note: we test only if os.remove is successfully called with correct arguments.

Let's see how we can test a Pyramid/Django view function:

@view_config(
    route_name='generate_deeplink',
    renderer='myProject:templates/generate_deeplink.mako',
    permission="view",
)
def generate_deeplink(request):
    if request.POST:
        try:
            booking = ExternalAPI(
                request.POST["reserv_number"].strip(),
                request.POST["email"].strip(),
            ).get()
            return dict(booking=booking)
        except AdministrationException as e:
            request.session.flash(
                'AdministrationException: {}'.format(e))

    return dict()

And the test for this piece of code can be something like this:

BOOKING_INFO_RS = """
<RS>
  <Administration>
    <Errors/>
  </Administration>
  <Responses>
    <ReservationRS status="PENDING" test="tist">
      <ReservNo>123456789</ReservNo>
    </ReservationRS>
  </Responses>
</RS>
"""

@mock.patch('myProject.views.ExternalAPI')
def test_generate_deeplink(self, external_API_rq):
    req = testing.DummyRequest(post={
        'reserv_number': '12345678 ',
        'email': 'john.doe@example.com',
    })
    external_API_rq().get.return_value = BOOKING_INFO_RS

    result = views.generate_deeplink(req)
    external_API_rq.assert_called_with(
        '12345678',
        'john.doe@example.com',
    )
    self.assertEqual(result['booking'], BOOKING_INFO_RS)

So before we call views.generate_deeplink(req) we create a dummy request, mock ExternalAPI and set the return value of ExternalAPI.get(). After we call our view we check whether ExternalAPI was called with the correct params and whether the result is what we expect.
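
The line external_API_rq().get.return_value = ... works because calling a mock always returns the same object (its return_value), no matter which arguments are passed. A minimal sketch of that behavior, assuming Python 3's unittest.mock; external_api and the response string are hypothetical stand-ins for the patched class and its data:

```python
from unittest import mock

# stands in for the patched ExternalAPI class
external_api = mock.MagicMock()

# configure the "instance" that ExternalAPI(...) will return
external_api().get.return_value = "mocked response"

# this is what the code under test effectively does
booking = external_api("12345678", "john.doe@example.com").get()

assert booking == "mocked response"
# assert_called_with checks the most recent call to the mock
external_api.assert_called_with("12345678", "john.doe@example.com")

# every call hands back the very same configured object
assert external_api("anything") is external_api.return_value
```

Note that the setup call external_api() is recorded as a call too, which is why the assertion on the arguments comes right after the call made by the code under test.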

 

Here is a slightly more complex example that also uses the very useful freezegun lib.

@view_config(route_name='send_deeplink', permission="view")
def send_deeplink(request):
    _ = request.localizer.translate
    reserv_number = request.matchdict['reserv_number']
    email = request.matchdict['email']
    booking = ExternalAPI(reserv_number, email).get()

    json_querystring = ExternalAPI_checking_something(
        email,
        reserv_number,
        lang=langs.get(request.localizer.locale_name),
        data1=booking["data1"],
        data2=booking["data2"],
        data3=booking["data3"],
    ).query()

    deeplink = request.route_url(
        "some_route",
        lang=request.localizer.locale_name,
        query=json_querystring,
        booking_number=reserv_number,
        email=email,
    )

    email_template = Template(
        filename=request.registry.get("BASE") +
        '/myProject/customer/templates/mail_deeplink.mako',
        input_encoding='utf-8',
        output_encoding='utf-8',
    )

    body = email_template.render(
        request=request,
        deeplink=deeplink,
        booking=booking,
        date=datetime.now().strftime("%d.%m.%Y"),
        _=_,
    ).decode('utf_8')

    author = request.registry.get("mail_default_sender")
    subject = _("SUBJECT")

    message = Message(
        subject=subject,
        sender=author,
        recipients=[email],
        html=body,
    )

    request.registry['mailer'].send_immediately(message, fail_silently=False)

    request.session.flash('Deeplink successfully sent.')
    return HTTPFound(location=request.route_url('generate_deeplink'))

And our unit test:

# -*- coding: utf-8 -*-
from freezegun import freeze_time
from pyramid import testing
from pyramid.httpexceptions import HTTPFound
from myProject import views

import mock
import os
import unittest


BOOKING_INFO_DICT = {
    'data1': {
        'id': '15',
        'time': '12:00',
    },
    'data2': {
        'id': '12',
        'time': '12:00',
    },
    'data3': '29',
}
QUERY_STRING = 'param_1%3Dvalue_1%2Bparam_2%26param_3%3Dvalue_3%26'
BASE_PATH = os.path.abspath(os.path.join(os.pardir, os.pardir))
DEEPLINK = 'this is mocked deeplink'


@freeze_time("2015-01-14 12:00:01")
class TestDeeplink(unittest.TestCase):

    def setUp(self):
        self.config = testing.setUp()

    def tearDown(self):
        testing.tearDown()

    @mock.patch('myProject.views.Message')
    @mock.patch('myProject.views.Template')
    @mock.patch('myProject.views.ExternalAPI_checking_something')
    @mock.patch('myProject.views.ExternalAPI')
    def test_generate_deeplink(
        self,
        external_API_rq,
        external_API_2_rq,
        mock_Template,
        mock_Message,
    ):
        req = testing.DummyRequest()
        req.localizer.locale_name = 'en'
        req.registry = {
            'BASE': BASE_PATH,
            'mail_default_sender': 'admin@localhost.com',
            'mailer': mock.MagicMock()
        }
        req.registry['mailer'].send_immediately = mock.MagicMock()
        req.matchdict = {
            'email': 'john.doe@example.com',
            'reserv_number': '12345678',
        }
        external_API_2_rq().query.return_value = QUERY_STRING
        external_API_rq().get.return_value = BOOKING_INFO_DICT
        req.route_url = mock.MagicMock()
        req.route_url.return_value = DEEPLINK
        mock_Template().render.return_value = 'rendered html email'
        req.session.flash = mock.MagicMock()
        mock_Message.return_value = 'awesome email'

        result = views.send_deeplink(req)

        external_API_rq.assert_called_with(
            '12345678',
            'john.doe@example.com',
        )

        # because `request.route_url` is called 2 times we must check if it
        # was called in the right order and with the correct params
        req.route_url.assert_has_calls([
            mock.call(
                'some_route',
                lang='en',
                query=QUERY_STRING,
                booking_number='12345678',
                email='john.doe@example.com',
            ),
            mock.call('generate_deeplink'),
        ])

        # checking if `Template` was called with the correct params
        mock_Template.assert_called_with(
            filename=os.path.join(
                BASE_PATH,
                'myProject/customer/templates/mail_deeplink.mako',
            ),
            input_encoding='utf-8',
            output_encoding='utf-8',
        )

        # checking if `Template.render()` was called with the correct params
        mock_Template().render.assert_called_with(
            request=req,
            deeplink=DEEPLINK,
            booking=BOOKING_INFO_DICT,
            date='14.01.2015',  # time that we have set with `@freeze_time`
            _=req.localizer.translate,
        )

        # ...
        mock_Message.assert_called_with(
            subject='SUBJECT',
            sender='admin@localhost.com',
            recipients=['john.doe@example.com'],
            html='rendered html email',
        )

        # ...
        req.registry['mailer'].send_immediately.assert_called_with(
            'awesome email',
            fail_silently=False,
        )

        # ...
        req.session.flash.assert_called_with(
            'Deeplink successfully sent.')

        # checking if the return result is redirect - HTTPFound instance
        self.assertIsInstance(result, HTTPFound)

 

First we mock variables and functions and then we set return values for these functions (e.g. mock_Template().render.return_value = 'rendered html email'). After the stage is set we call our view (result = views.send_deeplink(req)) and then we start checking what was called (e.g. ...assert_called_with(...)).

One thing to remember is to always set up your mock functions and return values BEFORE you call your view, and to check what was called AFTER you call your view (result = views.send_deeplink(req)).
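
The same three stages in a tiny, self-contained sketch, assuming Python 3's unittest.mock. The notify function is a hypothetical stand-in for a view that calls route_url twice, which is the case where assert_has_calls is useful for checking the order of calls.

```python
from unittest import mock


def notify(route_url, flash):
    # hypothetical stand-in for a view: builds two URLs and flashes a message
    deeplink = route_url("some_route", lang="en")
    flash("Deeplink successfully sent.")
    return deeplink, route_url("generate_deeplink")


# 1. set the stage: create the mocks and their return values
route_url = mock.MagicMock(return_value="mocked url")
flash = mock.MagicMock()

# 2. call the code under test
result = notify(route_url, flash)

# 3. check what was called, and in which order
route_url.assert_has_calls([
    mock.call("some_route", lang="en"),
    mock.call("generate_deeplink"),
])
flash.assert_called_once_with("Deeplink successfully sent.")
assert result == ("mocked url", "mocked url")
```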

 

MongoDB not starting up

A few days ago I had a problem where MongoDB crashed and I couldn't start it up. I realized that this problem occurred because the disk was full. So I removed a few things with rm -rf and tried sudo service mongod start. But this wasn't successful. I googled the problem and quickly found the solution:

 

sudo rm /var/lib/mongodb/mongod.lock
sudo mongod --repair
sudo service mongodb start

After this mongodb started without any problems.

Git basics: workflow, pull request, rebasing, …

When I started using git (several years ago) I used only the git command line. After a few years I decided to start using SourceTree. I regret this decision to this day :). The problem is that there are a lot of things you can't do with it. Another big problem is that when you move to a development environment where you can't install SourceTree (e.g. a remote machine via SSH) you are "lost". Because there aren't that many git commands (that you should be using on a daily basis), there is no good reason not to learn them.

So here are some git commands that I use frequently.

 

git checkout LOCAL_OR_REMOTE_BRANCH  # switch to a local or remote branch

# "UNDO" COMMANDS
git reset  # unstage files added with git add
git checkout .  # discard all changes to tracked files
git checkout dir/file  # discard changes in a directory or a specific file
git clean -f  # delete all untracked files

# COMMIT COMMANDS
git commit --amend  # add the staged changes to the last commit. This is VERY useful, e.g. when you forgot to add a file or made a minor mistake.
git rebase -i HEAD~6  # lists the last 6 commits and lets you modify them

# LIST COMMANDS
git branch  # list local branches
git remote -v  # list remotes

# DIFF COMMANDS
git diff --name-status master..BRANCH  # shows a list of files that were changed
git diff --stat --color master..BRANCH  # shows a list of files that were changed, with more detail
git diff master..BRANCH  # shows the full diff; you can page through the changes
git diff commit_id HEAD  # shows the difference between `commit_id` and the current commit

If you are working on a branch but master is several commits ahead, you can use rebase to put your local changes on top of it. Here is the detailed workflow:

git fetch origin  # get a fresh version of origin/master
git rebase origin/master  # replay your local commits on top of origin/master
# IF CONFLICTS:
# resolve conflicts & git add
git rebase --continue

Git general workflow:

git pull
git checkout -b BRANCH_NAME
# do your work...
git commit
git checkout master
git merge BRANCH_NAME
git push

"Github" workflow (i.e. creating pull-request):

git pull
git checkout -b BRANCH_NAME
# do your work...
git commit
git push origin BRANCH_NAME

The main difference between the general git workflow and the GitHub workflow is that on GitHub you (should) always create a pull request. So after you push your branch to the remote origin (line 5) you go to the GitHub repository, select your new branch and click on the green "New pull request" button (see Image 1).

Image 1: Creating pull request

 

Another very useful command is `revert`. If you have already merged your branch into master (which is protected) and you wish to "undo" it, you can do the following.

git revert SHA

SHA is the ID of the commit that you wish to undo (for a merge commit you also need to pass -m 1 to pick the mainline parent). You can read more about undoing, fixing, etc. here: link

NOTE: there are probably a lot more things you should know so I strongly suggest that you use "uncle Google" 🙂 (also there are several great YouTube videos you can watch if you are an absolute beginner).

 

Reset tree to original commit and use rebase not merge:

git reset --hard d27dce5129715f3c32aed376eeca348d142f5398 # initial commit
git fetch -p
git rebase origin/master
# fix merge conflicts (this is not merge, wording is a bit confusing)
git rebase --continue
git cherry-pick dac0848faf4219058b59dce8deb80084ade74828 # Update alembic script down_revision. fix merge conflicts (this is not merge, wording is a bit confusing) etc 
git push...

 

Force “git pull” to overwrite local files:

git fetch --all
git reset --hard origin/<branch_name>

git fetch downloads the latest from remote without trying to merge or rebase anything.

Then git reset resets the current branch to what you just fetched. The --hard option changes all the files in your working tree to match the files in origin/<branch_name> (Source: link)

 

Changing the timestamp of a previous Git commit:

git filter-branch --env-filter \
"if test \$GIT_COMMIT = '06387f3c078f9f36dc4074d90550eb0b11013607'
then
    export GIT_AUTHOR_DATE='Thu Apr 5 19:54:26 2018 +0200'
    export GIT_COMMITTER_DATE='Thu Apr 5 19:54:26 2018 +0200'
fi" && rm -fr "$(git rev-parse --git-dir)/refs/original/"

 

Some additional reading:

  • A really good post about git rebase: link

Deploy website with Git bare

I tried to follow a few tutorials / how-tos but none of them worked perfectly, so I decided to write my own how-to. If nothing else I will know which one to follow the next time I come across this problem 🙂

One of the problems (as always) was that some of them were obsolete or did not say which git version the how-to was written for...

I was doing this with git version 1.8.3.2

These were my steps:

Server side:

git init --bare
git config core.bare false
git config core.worktree /home/gasper/website
git config receive.denycurrentbranch ignore
cat > hooks/post-receive
#!/bin/sh
git checkout -f
^D
chmod +x hooks/post-receive

NOTE: if you want to deploy a specific branch and not 'master' branch then you must change hooks/post-receive to:

cat > hooks/post-receive
#!/bin/sh
git checkout -f [BRANCH_NAME]
^D

Local machine:

git remote add web ssh://myserver/home/gasper/git/somesite.git
git push web +master:refs/heads/master

 

If you have any questions leave a comment.

Magento remove super attribute

The table catalog_product_super_attribute links products to super product attributes. You can add attributes to, and remove them from, existing configurable products there.

To remove one super product attribute (as they are called) from all configurable products, you could execute this SQL query in the database:

DELETE FROM catalog_product_super_attribute WHERE attribute_id = <id>;

Source: link

Magento Translation priorities

In Magento you can overwrite code in app/code/local or app/code/community. The same concept applies to translations. There are 4 levels (or maybe even more) of translation hierarchy. Let's jump right into an example:

We have the language de_DE. The first level (priority) is in app/locale/de_DE. Here you should have files that start with "Mage_" and end with ".csv" (every translation file in Magento is a CSV file). You might also have translation files like CustomModule_ModuleName.csv, where CustomModule is usually the name of the company that created the module.

I will skip the 2nd and 3rd levels because they are not so important IMO (google them 🙂). The 4th level has the highest priority. If Magento finds a translation on this level it will overwrite ANY other translation that already exists. These translations are located in app/design/frontend/default/THEME_NAME/locale/de_DE/translate.csv, where THEME_NAME is the theme you are using. Which theme you are using can be found in the Magento backend (google it).

Here you will have only ONE file, called translate.csv. All the "custom" translations for this language should go here, because then they are all in one place (unlike app/locale/de_DE, where you have tons of files).
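
The "later level overwrites earlier level" behavior can be pictured with plain Python dictionaries. This is only a toy model of the priority rule described above, not Magento's actual loading code; the function name and the German strings are made up for illustration.

```python
def resolve_translations(*levels):
    # merge the levels in order; a later level overwrites an earlier one
    merged = {}
    for level in levels:
        merged.update(level)
    return merged


# made-up contents of a module file such as app/locale/de_DE/Mage_*.csv
module_level = {"Add to Cart": "In den Warenkorb", "Checkout": "Kasse"}
# made-up contents of the theme's translate.csv (highest priority)
theme_level = {"Add to Cart": "Jetzt kaufen"}

translations = resolve_translations(module_level, theme_level)
print(translations["Add to Cart"])  # Jetzt kaufen
print(translations["Checkout"])     # Kasse
```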

 

Also do not forget to CLEAR CACHE!!! 🙂

Remote Sync – Examples

I am mainly a developer and use a server just to get my code online, so I often forget how to use rsync. So here is my cheat sheet for rsync 🙂

  1. -v : verbose
  2. -r : copies data recursively (but doesn’t preserve timestamps and permissions while transferring data)
  3. -a : archive mode; copies files recursively and also preserves symbolic links, file permissions, user & group ownerships and timestamps
  4. -z : compress file data
  5. -h : human-readable, output numbers in a human-readable format

 

Do a Dry Run with rsync

With the --dry-run option rsync will not make any changes; it only does a dry run of the command and shows its output. If the output shows exactly what you want to do, you can remove the --dry-run option from your command and run it for real.

 

rsync --dry-run --remove-source-files -zvh backup.tar /tmp/backups/

Copy/Sync a Directory on Local computer

 

rsync -avzh /root/rpmpkgs /tmp/backups/

Copy a File from a Remote Server to a Local Server with SSH

 

rsync -avzhe ssh root@192.168.0.100:/root/install.log  /home/destination

Exclude multiple files and directories at the same time

rsync -avzh --exclude file1.txt --exclude dir3/file4.txt SOURCE/ DESTINATION/
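
If you call rsync from a script, the dry-run-first habit and the repeated --exclude flags can be wrapped in a small helper. This is a sketch of my own; rsync_command is a hypothetical name, and it only builds the argument list out of flags already used in this post.

```python
import shlex


def rsync_command(source, destination, excludes=(), dry_run=True):
    # build the argument list; dry_run defaults to True to stay on the safe side
    command = ["rsync", "-avzh"]
    if dry_run:
        command.append("--dry-run")
    for pattern in excludes:
        command.extend(["--exclude", pattern])
    command.extend([source, destination])
    return command


cmd = rsync_command(
    "SOURCE/",
    "DESTINATION/",
    excludes=["file1.txt", "dir3/file4.txt"],
)
# print the command the way you would type it in a shell
print(" ".join(shlex.quote(part) for part in cmd))
```

Once the printed command looks right, the list can be passed to subprocess.run(cmd, check=True), and dry_run=False performs the actual transfer.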

 

More detailed examples are found here: Link and link
