WP Meetups

A few months back I noticed we actually have regular WordPress Meetups in Ljubljana, where we are based. We attended one in April, where David talked about theming and Emanuel about bringing WordPress into the Public Sector. At the second one, in June, we were active participants: Janez and I delivered a talk titled “Lessons learned running 25k WordPress blogs”, describing how we scaled Easy Blog Networks to 25k blogs running on several hundred servers.

Both events also had a Lightning Talks section, which is what I normally enjoy the most: so many great ideas packed into such a short timeframe. Looking forward to the next meetup, which should happen sometime in autumn!

Lessons learned from EuroPython 2016

This was my first EuroPython conference and I had high expectations because I had heard a lot of good things about it. I must say that overall it didn’t let me down. I learned several new things and met a lot of new people. So let’s dive straight into the most important lessons.

On Tuesday I attended the “Effective Python for High-Performance Parallel Computing” training session by Michael McKerns. This was by far my favorite training session and I learned a lot from it. Before Michael started with code examples and code analysis he emphasized two things:

  1. Do not assume what you hear/read/think. Time it and measure it.
  2. Stupid code is fast! Intelligent code is slow!

At this point I knew that the session was going to be amazing. He gave us a GitHub link (https://github.com/mmckerns/tuthpc) where all the examples, with profiler results, were located. He stressed that we shouldn’t just believe him and that we should test them ourselves (lesson #1).

I strongly suggest cloning his GitHub repo and testing those examples yourself. Here are my quick notes (TL;DR):

  • always compile regular expressions
  • use local variables (true = True, local = GLOBAL): local name lookups are faster than global ones
  • if you know how many elements your list will have, create it with None elements and then fill it (L = [None] * N)
  • when you need to insert items at index 0 of a list, append them and then reverse the list (each insert at index 0 is O(n), each append is O(1))
  • use built-in functions, use built-in functions, use built-in functions!!! (they are implemented in C)
  • when extending a list use .extend() and not +
  • searching in a set (hash table) is a lot faster than searching in a list (O(1) vs O(n))
  • constructing a set is much slower than constructing a list, so you usually don’t want to transform a list into a set just to search in it, because it will be slower overall. But again, you should test it
  • += on a mutable object such as a list modifies it in place instead of creating a new object, so use it in loops
  • a list comprehension is faster than a generator. A for loop is faster than a generator and sometimes also faster than a list comprehension (you should test it!)
  • importing is expensive (e.g. importing numpy takes about 0.1 sec)
  • switching between Python arrays and numpy arrays is very expensive
  • if you start writing intelligent and complex code, stop and think about whether there is a more stupid way of achieving your goal (see lesson #2)
  • optimize the code you want to run in parallel. This is more important than just running it in parallel.
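
In the spirit of lesson #1, here is a minimal sketch of how a few of the notes above could be checked with the standard timeit module. The function names, list sizes and repeat counts are my own assumptions and are not taken from Michael’s repo, and the numbers will differ between machines and Python versions, so treat it as a starting point rather than a benchmark.

```python
import timeit

N = 10000
ITEMS = list(range(N))
ITEM_SET = set(ITEMS)


def prealloc():
    # Create the list with None elements, then fill it.
    out = [None] * N
    for i in range(N):
        out[i] = i * 2
    return out


def appended():
    # Grow the list one append at a time.
    out = []
    for i in range(N):
        out.append(i * 2)
    return out


def comprehension():
    return [i * 2 for i in range(N)]


def front_insert():
    # Every insert at index 0 shifts all existing elements.
    out = []
    for i in range(1000):
        out.insert(0, i)
    return out


def append_then_reverse():
    # Same result as front_insert(), built with appends and one reverse.
    out = []
    for i in range(1000):
        out.append(i)
    out.reverse()
    return out


def in_list():
    return (N - 1) in ITEMS      # linear scan


def in_set():
    return (N - 1) in ITEM_SET   # hash lookup


def measure(func, number=20):
    """Return the best of 3 repeats, in seconds, for `number` calls."""
    return min(timeit.repeat(func, number=number, repeat=3))


if __name__ == "__main__":
    for func in (prealloc, appended, comprehension,
                 front_insert, append_then_reverse, in_list, in_set):
        print("%-20s %.6f s" % (func.__name__, measure(func)))
```

Taking the minimum of several repeats is a common way to reduce noise from other processes; it says nothing about how the code behaves inside your real application, which you still need to profile.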

Threading and multiprocessing:

  • always measure whether and when threading/multiprocessing is actually faster. If you are using simple functions it will probably be slower
  • in parallel computing you need to catch and log errors
  • in parallel computing you always want your functions to return a value
  • in parallel computing you never want your code to “die”. Always try to return a reasonable default value even if an exception is raised. A slightly wrong answer is better than no answer!
  • when using threading/multiprocessing use .map(), and if you don’t care about the order use .imap_unordered(). It is the fastest because it yields each result as soon as it is available.
  • if you have a stop condition use .imap_unordered()
  • be aware of random module problems. The random state gets copied to all processes, so every process produces the same “random” numbers. The result is “random doesn’t work”. You need to create a random_seed function and ensure that each process is in a different random state.
  • is there a general rule for when to use threads and when to use multiprocessing? Use threads if you have light jobs (i.e. they execute in 0-1 sec)
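
A rough sketch of how several of these notes fit together, using only the standard library. It uses multiprocessing.pool.ThreadPool so that it runs anywhere without pickling concerns; multiprocessing.Pool offers the same map(), imap_unordered() and initializer interface, and that is where the reseeding step matters. The names safe_sqrt, reseed and first_above are hypothetical helpers I made up for illustration, not code from the training.

```python
import logging
import math
import random
from multiprocessing.pool import ThreadPool

log = logging.getLogger("workers")


def reseed():
    """Pool initializer: reseed the random module from OS entropy.

    Intended for process pools, where workers may start with a copy
    of the parent's random state.
    """
    random.seed()


def safe_sqrt(x):
    """Never "die" inside a worker: log the error, return a default."""
    try:
        return math.sqrt(x)
    except (ValueError, TypeError):
        log.exception("sqrt failed for %r", x)
        return float("nan")


def first_above(values, threshold, workers=4):
    """Stop condition: return the first finished result above threshold."""
    with ThreadPool(workers, initializer=reseed) as pool:
        # Results arrive in completion order, not input order.
        for result in pool.imap_unordered(safe_sqrt, values):
            if result > threshold:
                return result
    return None


if __name__ == "__main__":
    with ThreadPool(4, initializer=reseed) as pool:
        print(pool.map(safe_sqrt, [4, 9, -1]))   # order preserved
    print(first_above([1, 4, 9, 16], 2.5))
```

Returning NaN as the default is just one choice; the point is that the caller always gets a value back and the failure ends up in the log.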

Another interesting talk was about code review (“Another pair of eyes: Reviewing code well” by Adam Dangoor). He pointed out that one of the most important purposes of code review is to share knowledge. When you review other people’s code you learn a lot, especially if you take your time and really try to understand what the author was trying to achieve. It is also recommended to always say something nice about the code, especially when reviewing the code of a junior developer. And when you think that the code you are reviewing has a bug, write a test that proves it.

EuroPython 2016 was a really amazing experience that I would recommend to every Python developer/scientist. I’m really looking forward to EuroPython 2017!

Dear Plone, welcome to year 2014

TL;DR: Production-level Plone on free-tier Heroku: https://github.com/niteoweb/heroku-buildpack-plone

First, a bit of history: it was the year 2006 and I was realizing that I was not made to be an academic. I made my first strides into entrepreneurship and, being in IT, the first logical step was to create a few websites and try to get paid for my work. I did have some programming experience but hadn’t done any web development yet. I had heard PHP was not the ideal solution so I started looking elsewhere. In my student association I heard about this Plone thingy and gave it a go: I purchased a 20€ per month shared Plone hosting account at Nidelven IT and started hacking. It was the Plone 2.x era and boy was I productive! I threw in content, installed a ready-made theme and did some TTW tweaks. Done! First customer happy! Rinse & repeat, upgrade to a beefier server, rinse & repeat. NiteoWeb Ltd. was born.

Fast forward to a couple of months ago: we used GeckoBoard to drive a wall-mounted dashboard display. GeckoBoard works fine, but they want almost $60 per month if you want to drive more than one display. Sounds quite expensive for a bit of HTML and JavaScript, doesn’t it? So I looked around for alternatives and one of them was the Dashing dashboard from Shopify. I was reluctant to even give it a try as it was a self-hosted Ruby app. And I didn’t know *any* Ruby. But there, in their documentation, I found a short guide on how to deploy your very own version of a sample dashboard to your personal Heroku account. And when I say short, I mean short! I copy/pasted 6 simple commands into my console and that was it! My very own dashboard! After it was running I was motivated enough to read through Dashing’s docs and make a few minor tweaks. One “git push heroku master” later, my changes were again deployed to Heroku and showing up on my dashboard display. Wow, is this developer-friendly or what!

My mind drifted and I thought: Boy, wouldn’t it be nice if a non-Plone person could come to a Plone add-on page, create their very own Plone instance with the add-on installed, and immediately start using it? Uhm … why not? Why don’t we, as a community, provide “private” demos that people can use? Is there something that prevents us from using Heroku for demos, the same way as Dashing, and many other Ruby products, use it?

Turns out, there isn’t. During the Plone dinner at the EuroPython 2014 conference last week, I ordered a few beers and got hacking. Goal: get Plone to run on a free-tier Heroku account.

There have been attempts to run Plone on Heroku before, but they failed because they took the wrong approach. Look, Heroku, by default, supports Python apps that are installed with “pip”. Previous attempts all focused on fixing Plone so it could be installed with pip. And they failed, because Plone’s ecosystem is just too complex.

I decided to take a different approach: buildpacks. Heroku allows you to build *anything* on it. So I created a buildpack that supports zc.buildout. Once I got that done, it was not far from getting Plone installed on Heroku.

The next roadblock came in the form of Heroku’s ephemeral filesystem. Every time your Heroku “dyno” is restarted, the filesystem is recreated. And you lose your Data.fs. Humpff. Wait, but what about the PostgreSQL that Heroku offers? A production-quality PostgreSQL, for free, with a 10k row limit. That could work! So, add in support for RelStorage and you have a production-ready Plone site running on Heroku. For free. And you are not limited to one: you can have as many as you wish, on the same account. Heroku really is awesome!

So, Plone is suddenly again a viable option for college dropouts starting their businesses! No need for system administration knowledge, such as how to set up Nginx in front of Plone or how to do proper backups: just a few commands on the command line and your site is online!

And, our add-ons can now have demos. If you use Data.fs instead of PostgreSQL, the demo instance’s data will be recreated at least once per day, giving visitors an up-to-date demo instance, displaying what your Plone add-on does and how it looks.

Does this really work? Hell yeah it does! This blog has been running on Heroku since last week! And here’s a Plone 4.3 demo, a Plone 5 demo and a collective.cover demo. Wanna see plone.app.mosaic in action?

Why doesn’t your add-on have a demo yet? Follow the instructions on https://github.com/niteoweb/heroku-buildpack-plone and showcase your add-on to the world!

NiteoWeb attended a Pyramid sprint in Halle, Germany

Gocept, a company based in Halle (Saale), Germany, organized a Pyramid sprint, which lasted from 15th to 17th August 2013. The sprint took place at their headquarters which, by the way, has a lovely garden perfectly suited for relaxation, eating, drinking and development (not necessarily in that order!).

A bunch of former and present NiteoWeb employees took part, too. Halle is about 10 hours away from Ljubljana by car (quite a long ride indeed!), but it was well worth the trip. Gocept did an excellent job of feeding and entertaining all the sprint participants, and it was a great pleasure to meet Chris McDonough, author of the Pyramid web framework, in person. An interesting and amusing dude, I must say.

The happy sprint participants were quite productive and a whole bunch of bug fixes and enhancements were made (see the wrap-up for more details). The only real downside was that the sprint ended so early: three days really passed in the blink of an eye. But hey, isn’t that always the case when you’re having fun?

So Gocept, thanks again for everything and hope to see you in 2014!

Setuptools – run custom code in setup.py

A week or so ago I started developing an experimental Python package for one of our projects. At some point I realized that it would be convenient to automatically execute some additional initialization code during the package installation process (i.e. when “python setup.py install” is run).

This can be achieved by subclassing the setuptools.command.install.install class and overriding its run() method, like this (in setup.py):

from setuptools import setup
from setuptools.command.install import install


class CustomInstallCommand(install):
    """Customized setuptools install command - prints a friendly greeting."""
    def run(self):
        print("Hello, developer, how are you? :)")
        install.run(self)


setup(
    ...

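NOTE: We reference the parent class’ run method directly. On Python 2 we can’t use super(CustomInstallCommand, self).run(), because setuptools commands are old-style classes there and super() does not support them. On Python 3 all classes are new-style, so super().run() works as well.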

Now that we have a customized install class, we must tell the setuptools machinery to actually use it instead of the built-in version. We do this through the cmdclass parameter of the setup() function:

...

setup(
    ...

    cmdclass={
        'install': CustomInstallCommand,
    },

    ...
)

The value of the cmdclass parameter should be a dictionary whose keys are the names of the setuptools commands we’re customizing (‘install’ in our case), while the corresponding values are the custom command classes we defined earlier (CustomInstallCommand in this example).

BONUS

Sometimes you will want to apply the same modification to more than one command class. For instance, your package could also be installed in development mode (by running python setup.py develop), meaning that the setuptools.command.develop.develop class has to be overridden as well for your modifications of the installation procedure to have any effect in that scenario.

A straightforward approach would be to implement another class (e.g. CustomDevelopCommand) similar to the existing CustomInstallCommand class, but this would violate the DRY principle (“don’t repeat yourself”). What you can do instead is define a decorator which accepts a command class as a parameter, replaces its run() method and returns the modified class.

Here’s an example:

from setuptools import setup
from setuptools.command.develop import develop
from setuptools.command.install import install


def friendly(command_subclass):
    """A decorator for classes subclassing one of the setuptools commands.

    It modifies the run() method so that it prints a friendly greeting.
    """
    orig_run = command_subclass.run

    def modified_run(self):
        print("Hello, developer, how are you? :)")
        orig_run(self)

    command_subclass.run = modified_run
    return command_subclass

...

@friendly
class CustomDevelopCommand(develop):
    pass

@friendly
class CustomInstallCommand(install):
    pass


setup(
    ...

It’s very simple: we just replace the run() method of a command class with our customized version of it and then apply the decorator where necessary. If we later need to replace the greeting with something different, we only have to change the code in one place.

NOTE: Do not forget to provide the right value of the cmdclass parameter to the setup() function.

By the way, you might be looking at the decorator code and wondering why we explicitly store a reference (‘orig_run’) to the original run method. The reason is that we can’t simply call command_subclass.run(self) in the modified_run function directly, because that would cause infinite recursion!
Just look at the code carefully: at the end of the decorator, command_subclass.run becomes a reference to modified_run. If modified_run then calls command_subclass.run(self) in its body, it actually calls itself, again and again and again, until the maximum recursion depth is exceeded. Explicitly storing a reference to the original run() method is thus not redundant at all, it’s simply necessary.
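
To see the difference without touching a real installation, here is a small self-contained sketch for Python 3. Command is a hypothetical stand-in for a setuptools command class, and broken_friendly is a deliberately wrong variant of the decorator that I added purely for comparison; neither is part of setuptools.

```python
class Command:
    """Hypothetical stand-in for a setuptools command class."""

    def run(self):
        return "installed"


def friendly(command_subclass):
    # Look up the original run() once, before it gets replaced.
    orig_run = command_subclass.run

    def modified_run(self):
        return "hello, " + orig_run(self)

    command_subclass.run = modified_run
    return command_subclass


def broken_friendly(command_subclass):
    def modified_run(self):
        # Looked up at call time, when command_subclass.run already
        # points to modified_run, so this function calls itself.
        return "hello, " + command_subclass.run(self)

    command_subclass.run = modified_run
    return command_subclass


@friendly
class Good(Command):
    pass


@broken_friendly
class Bad(Command):
    pass


if __name__ == "__main__":
    print(Good().run())
    try:
        Bad().run()
    except RecursionError as exc:
        print("Bad().run() failed:", exc)
```

The only difference between the two decorators is when the lookup of run happens: once at decoration time, or on every call.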