Automatic localtime management in ESP8266 and other low-memory IoT devices

Justification

Over the last several years, personal computers and smartphones have become capable of displaying the local time, correctly adjusted for daylight saving time (DST), without requiring human intervention beyond selecting the correct timezone.

Nowadays, there are also IoT devices that need to support local time management: displaying the local time or otherwise making it available.

Timekeeping is performed using the Internet protocol NTP, which provides the correct UTC. When using a PC or a smartphone, the timezone is usually selected by manual user action.

However, some IoT devices may not have the UI needed for convenient timezone selection. In that case, it is desirable to support automatic timezone selection by default.

How to implement automatic timezone selection?

There are websites that look at your IP address and provide a best guess of your timezone.

Since those websites usually provide the timezone name rather than the string describing the DST transition dates (the so-called tz_string), the next step is to figure out the DST transition dates from the timezone name.

In devices with plenty of memory this is carried out by means of a timezone database.

For example, in Debian/Ubuntu based systems, this database is stored in the /usr/share/zoneinfo directory and occupies 3.5MB (the relevant package in Ubuntu 18.04 is tzdata and its version, as of Nov. 2019, is 2019c-0ubuntu0.18.04).

Memory constrained IoT devices

However, IoT devices are typically based upon memory-constrained controllers and cannot afford to store the whole timezone database locally just to determine the local time for a single timezone.

Therefore, IoT devices need to access an Internet based service to get the correct timezone information, just as they get UTC time updates using NTP. In other words, those IoT devices effectively outsource the timezone database management.
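To illustrate why a tz_string alone is enough on the device side, here is a minimal sketch for a POSIX host (it is not ESP8266 code). It relies on Python's time.tzset(), which is available on Unix only. The helper name local_time_for and the sample US Eastern rule are illustrative assumptions, not part of the project.

```python
import os
import time


def local_time_for(tz_string, utc_epoch):
    """Return the broken-down local time for utc_epoch under a POSIX TZ rule."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz_string
    time.tzset()  # Unix only: re-read the TZ rule from the environment
    try:
        return time.localtime(utc_epoch)
    finally:
        # Restore whatever timezone the process had before.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


# Sample rule (an assumption for the demo): US Eastern time with its DST transitions.
EASTERN = "EST5EDT,M3.2.0,M11.1.0"
winter = local_time_for(EASTERN, 1546344000)  # 2019-01-01 12:00:00 UTC
summer = local_time_for(EASTERN, 1561982400)  # 2019-07-01 12:00:00 UTC
print(winter.tm_hour, winter.tm_isdst)
print(summer.tm_hour, summer.tm_isdst)
```

No timezone database is consulted here: the offset and both transition rules come from the string itself, which is what makes the approach attractive for a small device.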

Internet service for providing the timezone information

An Internet service that provides the correct tz_string for a timezone name needs to keep its timezone database up to date at all times.

I implemented the Internet service as follows.

  1. A machine, running an Ubuntu 18.04 installation with a webserver, is used.
  2. The Internet service is implemented as a small WSGI-based website. It uses the database described in item 3.
  3. A script scans the /usr/share/zoneinfo contents and creates a small database for translating timezone names into the corresponding tz_string values.
  4. There is a mechanism for invoking the above script and restarting the web server each time the tzdata package is updated/installed/re-installed.
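As a rough idea of what the scanning step in item 3 might look like, the sketch below reads the compiled zone files directly. It assumes the files are in TZif version 2 or later, where (per RFC 8536) the file ends with a footer consisting of a newline, the POSIX TZ string, and another newline. The function names are hypothetical, and the script in the actual project may work differently.

```python
from pathlib import Path


def tz_string_from_tzif(data: bytes) -> str:
    """Extract the footer TZ string from the bytes of a TZif (v2+) file."""
    if data[:4] != b"TZif":
        raise ValueError("not a TZif file")
    if data[4:5] == b"\x00":
        raise ValueError("TZif version 1 files carry no footer")
    if not data.endswith(b"\n"):
        raise ValueError("missing footer terminator")
    # The footer is the text between the last two newline bytes.
    start = data.rfind(b"\n", 0, len(data) - 1)
    if start < 0:
        raise ValueError("missing footer")
    return data[start + 1:-1].decode("ascii")


def build_table(zoneinfo_root):
    """Map timezone names (paths relative to the root) to their tz_string."""
    root = Path(zoneinfo_root)
    table = {}
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            table[path.relative_to(root).as_posix()] = tz_string_from_tzif(
                path.read_bytes())
        except ValueError:
            continue  # skip zone.tab and other non-TZif files
    return table
```

Calling build_table("/usr/share/zoneinfo") on a Debian/Ubuntu system would also pick up the copies under the posix/ and right/ subdirectories, so a real script would probably want to filter those out.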

Show us the code!

The GitHub project tddpirate/tzdata2tzstring includes redacted versions of both an implementation of the above website and a sample ESP8266 client.
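The repository holds the actual code. Purely to show the shape of such a lookup service, here is a framework-free WSGI sketch (the real site uses Falcon). The query parameter name tz, the in-memory table and its two sample entries are assumptions made for this illustration.

```python
from urllib.parse import parse_qs

# Illustrative table; a real deployment would load it from the generated database.
TZ_TABLE = {
    "Asia/Jerusalem": "IST-2IDT,M3.4.4/26,M10.5.0",
    "Etc/UTC": "UTC0",
}


def application(environ, start_response):
    """WSGI callable: answer ?tz=<timezone name> with the matching tz_string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("tz", [""])[0]
    tz_string = TZ_TABLE.get(name)
    if tz_string is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown timezone\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [tz_string.encode("ascii")]
```

A plain-text response keeps the client side trivial: the device only needs an HTTP GET and can use the body as its TZ rule without any parsing.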

Credits

I wish to thank the Python Israel Telegram group members for advice about selecting a Python WSGI framework. I ended up selecting Falcon because benchmarks indicated that it is faster than Flask and Bottle.

The members of the לינוקס Telegram group deserve thanks, too. They helped me find the mechanism for appending my own postprocessing scripts after a Debian/Ubuntu package installation or upgrade.

Python discovers its inner PHP and JavaScript personae

Did you recently switch from PHP or JavaScript to Python, and are missing the fun of being bitten by your programming language?

The collection of surprising Python snippets and lesser-known features is your ultimate guide for provoking Python to bite you in the arse.

I got the ability to work with Heroku using my Debian Stretch system

The other day I found that:

  1. Heroku needs Python 3.6 or later to work (as of June 22, 2018). See: Getting Started on Heroku with Python.
  2. Debian Stretch (Debian Stable as of June 22, 2018) and its backports have only Python 3.5.

The solution was to build a Docker image based upon Ubuntu 18.04, which does have Python 3.6. See the project https://gitlab.com/TDDPirate/heroku_on_debian in GitLab.

July 15, 2018 update:

After I complained about flakiness of Selenium-based tests when the Selenium server is running outside of the Docker container while the application runs inside the container, Udi Oron suggested another way to run Python 3.6 on a Debian Stretch system: use pyenv.

Turns out that pyenv solves the pain point of running Python 3.6 on Debian Stretch without having to use a container. So Selenium-based tests are now stable.

The following is an excellent article about using pyenv:
Pyenv – Python Version Management Made Easier

And the following is a link to the GitHub repository:
https://github.com/pyenv/pyenv

I suspect that pyenv is the reason why people are not in a hurry to backport new Python versions to Debian.

How to visually compare two PDF files? (cont’d)

When I asked the above question in a Telegram group, people also proposed other tools, which I summarize below.
Amiad Bareli, Amit Aronovitch, Meir Gil and Yehuda Deutsch – thanks.

  1. ImageMagick compare
  2. matplotlib testing framework – also supports PDF:
    >>> import matplotlib.testing.compare
    >>> matplotlib.testing.compare.comparable_formats()
    ['png', 'eps', 'svg', 'pdf']
  3. pHash – the open source perceptual hash library.

How to visually compare two PDF files?

I have an application written in Python, which uses the ReportLab package for exporting PDF files.

Of course, the application needs to be tested. Among other tests, the PDF export function needs to be tested to ensure that the visual rendering of PDF files did not unexpectedly change.

Since it is possible to create and save an expected-results PDF file using fabricated test data, the above implies the need to compare two PDF files. It turns out that two PDF files created from the same data on two different dates differ, due to embedded timestamps.

Hence, the need to compare visual renderings of the PDF files. ImageMagick’s convert knows how to convert PDF files into PNG. However, one needs to set the background and remove the alpha channel.

convert can also perform a bitwise XOR on two image files, but it must be told how to compute it. This is documented in Stack Overflow: Searching for a way to do Bitwise XOR on images.

The script in https://gitlab.com/TDDPirate/compare_pdfs implements all the above.
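Independently of that script, the following Python sketch shows one way the rendering and comparison steps could be driven. It is an assumption-laden illustration: it uses ImageMagick's compare with the AE (absolute error) metric instead of the XOR recipe, it assumes single-page PDFs, and it needs ImageMagick with PDF support installed (on ImageMagick 7 the commands are spelled magick ...). All function names are hypothetical.

```python
import os
import subprocess
import tempfile


def render_cmd(pdf_path, png_path, density=150):
    """Build the command that rasterizes a PDF onto an opaque white page."""
    return ["convert", "-density", str(density), pdf_path,
            "-background", "white", "-alpha", "remove", "-alpha", "off",
            png_path]


def compare_cmd(png_a, png_b):
    """Build the command that counts differing pixels between two images."""
    return ["compare", "-metric", "AE", png_a, png_b, "null:"]


def pdfs_look_identical(pdf_a, pdf_b):
    """Rasterize two single-page PDFs and report whether no pixel differs."""
    with tempfile.TemporaryDirectory() as workdir:
        png_a = os.path.join(workdir, "a.png")
        png_b = os.path.join(workdir, "b.png")
        subprocess.run(render_cmd(pdf_a, png_a), check=True)
        subprocess.run(render_cmd(pdf_b, png_b), check=True)
        result = subprocess.run(compare_cmd(png_a, png_b))
        # compare exits with 0 for similar images, 1 for dissimilar, 2 on error.
        if result.returncode == 2:
            raise RuntimeError("ImageMagick compare failed")
        return result.returncode == 0
```

A test could then assert pdfs_look_identical("expected.pdf", "actual.pdf"), where both file names are placeholders.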

Outputting PDFs with your own fonts from your Django application

Do you use reportlab for creating PDF files from your Django application, as instructed in Outputting PDFs with Django?
Do you wish to add your own font files?
Do you need to render Hebrew text?

If yes, the following information will help you.

When installed in a virtualenv, as recommended by good working practices, reportlab searches for the font files it uses in your virtualenv/lib/python3.x/site-packages/reportlab/fonts, and it is not a good idea to add your own font files there.

Instead, add your font files to your_project_root_directory/reportlab_extra_fonts where your_project_root_directory is where your project’s manage.py is located. Add the following to any Python script that uses reportlab (usually views.py), after all regular reportlab import and configuration statements.
# The following configures extra reportlab fonts.
import os
from django.conf import settings
import reportlab.rl_config  # needed for the TTFSearchPath reference below
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
# Let reportlab look for TTF files in the project's own fonts directory.
reportlab.rl_config.TTFSearchPath.append(os.path.join(settings.BASE_DIR, 'reportlab_extra_fonts'))
HEBREW_FONT_NAME = 'your_font_name'
pdfmetrics.registerFont(TTFont(HEBREW_FONT_NAME, 'your_font_file_name.ttf'))
# The above configures extra reportlab fonts.

If you want to properly display Hebrew in your PDF file (the probable reason why you needed to add your own fonts in the first place), you need to convert the text yourself from logical ordering to visual ordering, because reportlab (as of version 3.4.0) does not currently process BiDi text. For this purpose, install the python-bidi package in your virtualenv (using pip install python-bidi) and add the following import statement to your views.py script:
from bidi.algorithm import get_display
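Now get_display() will reorder your BiDi text: pass each string through it before handing the string to reportlab, for example by using get_display(text) in place of text in a drawString() call.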

The Python module for file type identification, called ‘magic’, is not standardized

I found out the hard way that the API exported by the Python module ‘magic’ differs among different versions of the module.

The version installed when installing the Debian package ‘python-magic’ expects the following API:

import magic
mymagic = magic.open(magic.MAGIC_MIME_TYPE)
mymagic.load()
mtype = mymagic.file(inpfname)
print("The MIME type of the file %s is %s" % (inpfname,mtype))

The version installed using ‘pip install python-magic’ expects the following API:

import magic
mymagic = magic.Magic(mime=True)
mtype = mymagic.from_file(inpfname)
print("The MIME type of the file %s is %s" % (inpfname,mtype))

The following code allows the rest of the script to work the same way with either version of ‘magic’:

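import magic

def build_magic():
    """Return an object whose .file(path) yields the MIME type, for either 'magic' module."""
    try:
        # API of the version installed by the Debian package.
        mymagic = magic.open(magic.MAGIC_MIME_TYPE)
        mymagic.load()
    except AttributeError:
        # API of the version installed by pip; alias from_file as file.
        mymagic = magic.Magic(mime=True)
        mymagic.file = mymagic.from_file
    return mymagic

mymagic = build_magic()
mtype = mymagic.file(inpfname)
print("The MIME type of the file %s is %s" % (inpfname, mtype))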

PyGuile – Part 5 – Python objects (PyObjects) as proxies for Guile objects (SCMs)

An essential part of integration of Scheme (as implemented in Guile) and Python is allowing Python code to call back code implemented in Scheme. It is also desirable to be able to access data and invoke methods on otherwise-opaque objects created and managed in Scheme. The specifics of opaque object access should also be independent of the specific object system being used in the Scheme-written part of the application.

When implementing in PyGuile the proxying scheme, in which PyObjects serve as proxies for SCMs, the following need to be taken into account:

  • Data type conversion
  • Retention of references as a protection against garbage collection

Data type conversions

A Python callback can be expected to receive positional and keyword arguments, and return a result of any type. Therefore, templates (possibly trivial) for converting between PyObjects (Python data types) and SCMs (Guile data types) need to be associated with each callback.

In the case of objects, we need to associate with each attribute getter a template for converting the value from SCM into PyObject, and with each attribute setter a template for converting the value from PyObject into SCM. Each method that can be invoked on the object needs at least two templates. Three templates should be provided if the object is to be manipulated through an interface that expects both positional and keyword arguments in the object’s methods.

All the templates needed to work with a SCM (as a callback or as an object) are associated with it when it is wrapped as it is being passed from Guile to Python.

Retention of references

PyObjects, which wrap SCMs, are not expected to be seen by Guile’s garbage collector. Therefore, we need a mechanism for protecting SCMs referenced by PyObjects.

Due to efficiency considerations, Guile’s scm_permanent_object, scm_gc_protect_object or scm_gc_unprotect_object should not be used on every SCM passed to Python. The solution is to create a set object in Guile, protect it using scm_permanent_object (a single call) and then register all wrapped SCMs in it. When a wrapping PyObject’s __del__ function is invoked, one of its actions is to remove the corresponding SCM object from the set. The set will be implemented using a standard hash table, whose keys will be indexes and whose values will be the SCMs themselves.
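The bookkeeping described above can be modelled in pure Python. This is only a sketch of the idea: in PyGuile the table would be a Guile hash table and the wrapper would be implemented in C, and the class names ProtectedSet and ScmProxy are hypothetical.

```python
import itertools


class ProtectedSet:
    """Stand-in for the single GC-protected Guile hash table."""

    def __init__(self):
        self._table = {}
        self._indexes = itertools.count()

    def register(self, scm):
        """Store scm under a fresh index, keeping it alive; return the index."""
        index = next(self._indexes)
        self._table[index] = scm
        return index

    def unregister(self, index):
        """Drop the entry, so the collector may reclaim the object again."""
        self._table.pop(index, None)

    def __len__(self):
        return len(self._table)


class ScmProxy:
    """Stand-in for a PyObject that wraps an SCM."""

    def __init__(self, scm, registry):
        self._registry = registry
        self._index = registry.register(scm)

    def __del__(self):
        # Runs when the proxy is destroyed: release the wrapped object.
        self._registry.unregister(self._index)


registry = ProtectedSet()
proxy = ScmProxy(object(), registry)
count_while_alive = len(registry)
del proxy  # CPython's reference counting runs __del__ right away
count_after_delete = len(registry)
```

Only the set itself needs protecting from the collector; each wrapped object costs one hash table insertion and one removal.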

PyGuile – Part 4 – Argument and result conversion issues

There is no 1:1 mapping between Scheme and Python data types. As a consequence, there are several cases in which PyGuile has to guess how the user would like the arguments and result of a Python function to be converted. Instead of guessing, we would like to empower the user to be explicit about the kinds of conversion they want.

The following is a census of ambiguous data conversion cases, which I identified.

  1. Scheme pair ->
    • Python 2-Tuple
    • Python 2-List
  2. Scheme list ->
    • Python Tuple
    • Python List
    • Nested tree of pairs (2-Tuples or 2-Lists)
  3. Python 2-Tuple or 2-List ->
    • Scheme pair
    • Scheme list
    • Scheme rational (if the Pythonic data structure consists of two integer values)
  4. Scheme alist (association list) ->
    • Python Dict
    • Python Tuple/List of 2-Tuples
  5. Python string ->
    • Scheme string
    • Scheme symbol
    • Scheme keyword

    Additional considerations:

    • Case sensitivity of symbols and keywords
    • The string representation of keywords in Guile has a leading dash – should it be retained or removed on the Python side?
  6. Python 1-character string ->
    • Scheme char
    • Scheme string

    Additional consideration: a utf-8 encoded glyph may be a sequence of several bytes.

  7. Python int ->
    • Scheme int
    • Scheme bignum
    • Scheme char
  8. Python None -> One of several possible values: ‘(), #f, SCM_EOL, ‘*None* or another custom Scheme value.
  9. Python (),[],{} ->
    • Scheme ‘()
    • SCM_EOL
    • Custom Scheme value
  10. Scheme ‘() ->
    • Python ()
    • Python []
    • Python {}
  11. SCM_EOL ->
    • Python (),[],{}
    • Python None
    • Custom Python value
  12. Scheme rational ->
    • Python Float
    • 2-Tuple of Python Ints
  13. Scheme exact/inexact flag in numerical values – if and how to represent it in the Python side of the application?
  14. Giant data structures with sparse access needs – lazy vs. eager conversion
  15. Exception objects
  16. Objects of certain classes (vectors, ports, functions, images, etc.)

There is also the separate issue of string encoding/decoding, with which we deal by mandating that anything passing between Scheme and Python has to be utf-8 encoded.
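Two of the ambiguous cases can be made concrete on the Python side. The sketch below uses Python 3 and the standard fractions module as a stand-in for a Scheme rational; it illustrates the trade-offs and is not PyGuile code.

```python
from fractions import Fraction

# Case 12: a rational such as 1/3 can become a float or a pair of ints.
rational = Fraction(1, 3)
as_float = float(rational)                             # compact, but inexact
as_pair = (rational.numerator, rational.denominator)   # exact, but needs unpacking

round_trip_from_pair = Fraction(*as_pair)    # recovers 1/3 exactly
round_trip_from_float = Fraction(as_float)   # the nearest binary fraction, not 1/3

# Case 6: one glyph is a single Python 3 character but several utf-8 bytes.
glyph = "\u05d0"  # HEBREW LETTER ALEF
encoded = glyph.encode("utf-8")
print(len(glyph), len(encoded))
```

The float conversion silently loses exactness, while the pair keeps it at the cost of a less convenient value. This is the kind of choice a user hint should be able to settle.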

One of the goals of PyGuile is to make it efficient to invoke Python library functions from Guile. Therefore, efficiency of conversion of function arguments and results is critical.

When there are no user hints, the following inefficiencies occur:

  1. PyGuile has to make a default (and possibly sub-optimal) choice when encountering one of the above ambiguous cases. The script using the data then has to reformat it to match its actual needs.
  2. PyGuile has to identify the data type of each datum. The present implementation does not go into the internal representation of Guile (SCM) and Python (PyObject) objects, therefore PyGuile has to test for various data types one by one, until one of them matches the argument.
  3. Sometimes a Python procedure needs to do no processing on one of its arguments. The argument’s value needs only to be passed around as a pointer, or to be inserted into the right place in a result data structure. In such a case, it is desirable to use the most efficient conversion possible i.e. wrap/unwrap opaque objects. This is a generalization of the case of giant data structures with sparse access needs.

Therefore, when performance is critical, hints from the user would help not only to disambiguate the conversion process but also to speed it up.

The user hints will be implemented as follows.
With each function (Python function invoked from Guile, or Guile function invoked from Python) we associate two (possibly degenerate) signatures. One signature will contain the hints for converting the function’s arguments. The second signature will hint how to convert the function’s result. The signatures are Scheme lists, whose leaf nodes are symbols denoting conversion functions.

Chris Jester-Young, in his answer to my question on Stack Overflow, proposed the following function for traversing two corresponding tree structures and applying the functions in one of them to the data in the other.

  (define (map-traversing func data)
    (if (list? func)
        (map map-traversing func data)
        (func data)))

Using it requires unquoting. Example:

  (map-traversing `((,car ,cdr) ,cadr) '(((aa . ab) (bb . bc)) (cc cd . ce)))

Our implementation will differ from the above in details, as the signatures’ leaf nodes do not denote proper Scheme functions.
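To show how a signature whose leaves are symbols rather than functions might be applied, here is a Python sketch of the same traversal. The converter table, the names in it and the function name convert are assumptions made for this illustration, not PyGuile's actual design.

```python
# Hypothetical mapping from hint symbols to conversion functions.
CONVERTERS = {
    "int": int,
    "float": float,
    "str": str,
    "tuple": tuple,
}


def convert(signature, data):
    """Walk signature and data in parallel, converting each leaf of data."""
    if isinstance(signature, list):
        if len(signature) != len(data):
            raise ValueError("signature and data have different shapes")
        return [convert(s, d) for s, d in zip(signature, data)]
    # Leaf node: look up the conversion function named by the symbol.
    return CONVERTERS[signature](data)


result = convert([["int", "str"], "tuple"], [["42", 7], [1, 2]])
print(result)
```

The only difference from map-traversing is the table lookup at the leaves, which is what lets the signature be plain data instead of a tree of callable objects.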

PyGuile – Part 3 – Non-goals

Every project, including software development projects, needs an identity. It needs a definition of its boundaries. It has to be clear about what is inside it and what is outside of it.

Without such a definition, the project would try to be too many things for many people, and as a result, its products would not be really useful for anyone.

A project’s identity makes it easier and faster to make design and trade-off decisions.

Given the above trite introduction, and given that there is a list of goals for the PyGuile project, a list of non-goals is needed as well and here it is.

  1. Theoretical academic purity – attempting to convert every data type from Guile to Python and vice versa, and to support the whole range of values assumed by each data type.
  2. Ability to mix code snippets from both Scheme and Python in the same source code file.
  3. Invocation of machine language libraries (static or DLLs) – for this purpose, there are already existing tools (SWIG and PerlXS).
  4. Framework for making it easy to add support for interoperation with yet another scripting language.

There are also some goals, which are low priority and I do not plan to shed tears if they prove to be impossible to achieve without significant effort:

  1. Thread-safety
  2. Tail recursion support