PyGuile – Part 4 – Argument and result conversion issues

There is no 1:1 mapping between Scheme and Python data types. As a consequence, there are several cases in which PyGuile has to guess how the user would like the arguments and result of a Python function to be converted. Instead of guessing, we would like to empower the user to be explicit about the kinds of conversion he wants.

The following is a census of the ambiguous data conversion cases that I have identified.

  1. Scheme pair ->
    • Python 2-Tuple
    • Python 2-List
  2. Scheme list ->
    • Python Tuple
    • Python List
    • Nested tree of pairs (2-Tuples or 2-Lists)
  3. Python 2-Tuple or 2-List ->
    • Scheme pair
    • Scheme list
    • Scheme rational (if the Pythonic data structure consists of two integer values)
  4. Scheme alist (association list) ->
    • Python Dict
    • Python Tuple/List of 2-Tuples
  5. Python string ->
    • Scheme string
    • Scheme symbol
    • Scheme keyword

    Additional considerations:

    • Case sensitivity of symbols and keywords
    • The string representation of keywords in Guile has a leading dash – should it be retained or removed on the Python side?
  6. Python 1-character string ->
    • Scheme char
    • Scheme string

    Additional consideration: a UTF-8 encoded glyph may be a sequence of several bytes.

  7. Python int ->
    • Scheme int
    • Scheme bignum
    • Scheme char
  8. Python None -> One of several possible values: '(), #f, SCM_EOL, '*None* or another custom Scheme value.
  9. Python (),[],{} ->
    • Scheme '()
    • SCM_EOL
    • Custom Scheme value
  10. Scheme '() ->
    • Python ()
    • Python []
    • Python {}
  11. SCM_EOL ->
    • Python (),[],{}
    • Python None
    • Custom Python value
  12. Scheme rational ->
    • Python Float
    • 2-Tuple of Python Ints
  13. Scheme exact/inexact flag in numerical values – whether and how to represent it on the Python side of the application?
  14. Giant data structures with sparse access needs – lazy vs. eager conversion
  15. Exception objects
  16. Objects of certain classes (vectors, ports, functions, images, etc.)

There is also the separate issue of string encoding and decoding, which we deal with by mandating that anything passing between Scheme and Python be UTF-8 encoded.
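To make the one-character-string caveat from case 6 concrete, here is a minimal sketch. It is written for Python 3 purely for illustration (the project itself embeds Python 2.4), and the helper name `is_single_char` is hypothetical. It shows that a single glyph may occupy several bytes once UTF-8 encoded, so "length 1" depends on whether one counts characters or bytes.

```python
# Illustrative only: a Python 3 str is a sequence of code points,
# while UTF-8 data crossing the language boundary is a sequence of bytes.
glyphs = ["a", "\u00e9", "\u20ac"]  # 'a', e-acute, euro sign

# Map each glyph to the number of bytes in its UTF-8 encoding.
utf8_lengths = {g: len(g.encode("utf-8")) for g in glyphs}


def is_single_char(data: bytes) -> bool:
    """Return True if the UTF-8 bytes decode to exactly one code point.

    Hypothetical helper: a converter could use a test like this before
    deciding to produce a Scheme char rather than a Scheme string.
    """
    return len(data.decode("utf-8")) == 1
```

A converter that tested the byte length instead would wrongly treat the euro sign as a three-character string.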

One of the goals of PyGuile is to make it efficient to invoke Python library functions from Guile. Therefore, efficiency of conversion of function arguments and results is critical.

When there are no user hints, the following inefficiencies occur:

  1. PyGuile has to make a default (and possibly sub-optimal) choice when encountering one of the above ambiguous cases. Then the script using the data has to reformat it to match its actual needs.
  2. PyGuile has to identify the data type of each datum. The present implementation does not go into the internal representation of Guile (SCM) and Python (PyObject) objects, therefore PyGuile has to test for various data types one by one, until one of them matches the argument.
  3. Sometimes a Python procedure needs to do no processing on one of its arguments. The argument’s value needs only to be passed around as a pointer, or to be inserted into the right place in a result data structure. In such a case, it is desirable to use the most efficient conversion possible, i.e. wrapping and unwrapping opaque objects. This is a generalization of the case of giant data structures with sparse access needs.
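The second inefficiency can be illustrated with a small Python sketch. It is not PyGuile's actual code: the probe table, the kind names and both functions are invented for illustration. It contrasts testing data types one by one with trusting a user-supplied hint.

```python
from fractions import Fraction

# Hypothetical probe table: (predicate, kind name) pairs tried in order.
# bool is tested before int because bool is a subclass of int in Python.
PROBES = [
    (lambda x: isinstance(x, bool), "boolean"),
    (lambda x: isinstance(x, int), "integer"),
    (lambda x: isinstance(x, float), "real"),
    (lambda x: isinstance(x, Fraction), "rational"),
    (lambda x: isinstance(x, str), "string"),
    (lambda x: isinstance(x, (list, tuple)), "list"),
]


def guess_kind(obj):
    """Without a hint: test types one by one; also report the tests made."""
    for tests, (predicate, kind) in enumerate(PROBES, start=1):
        if predicate(obj):
            return kind, tests
    # Nothing matched: fall back to passing the object around opaquely.
    return "opaque", len(PROBES)


def kind_with_hint(obj, hint):
    """With a hint: trust the caller, so no probing is needed."""
    return hint, 0
```

In this sketch a string costs five tests before it is recognized, and it can only ever be guessed as a string; a hint costs no tests and can also request a symbol or keyword.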

Therefore, when performance is critical, hints from the user would help not only to disambiguate the conversion process but also to speed it up.

The user hints will be implemented as follows.
With each function (Python function invoked from Guile, or Guile function invoked from Python) we associate two (possibly degenerate) signatures. One signature will contain the hints for converting the function’s arguments. The second signature will hint how to convert the function’s result. The signatures are Scheme lists, whose leaf nodes are symbols denoting conversion functions.

Chris Jester-Young, in his answer to my question on Stack Overflow, proposed the following function for traversing two corresponding tree structures, and applying the functions in one of them to data in the other one.

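  ;; Walk FUNC and DATA in parallel. Where FUNC is a list, recurse into
  ;; the corresponding elements of DATA; otherwise apply FUNC as a leaf.
  (define (map-traversing func data)
    (if (list? func)
        (map map-traversing func data)
        (func data)))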

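Because the leaves must be procedure objects rather than symbols, the function tree has to be quasiquoted, with each procedure unquoted. Example:

  (map-traversing `((,car ,cdr) ,cadr) '(((aa . ab) (bb . bc)) (cc cd . ce)))
  ;; => ((aa bc) cd)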

Our implementation will differ from the above in details, as the signatures’ leaf nodes do not denote proper Scheme functions.
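One way such a difference could look, expressed in Python for illustration: the leaves are names, which are looked up in a registry of conversion functions. This is only a sketch under assumptions. The registry, the converter names and `apply_signature` are all hypothetical and are not PyGuile's API.

```python
# Hypothetical converter registry; the names are invented for this sketch.
CONVERTERS = {
    "str->symbol": lambda s: ("symbol", s),  # tagged tuple stands in for a symbol
    "str->string": lambda s: s,
    "pair->tuple": lambda p: tuple(p),
    "opaque": lambda x: x,                   # pass through untouched
}


def apply_signature(signature, data):
    """Walk signature and data in parallel; leaves name converters."""
    if isinstance(signature, list):
        if len(signature) != len(data):
            raise ValueError("signature and data shapes differ")
        return [apply_signature(s, d) for s, d in zip(signature, data)]
    # Leaf node: look the converter up by name, then apply it.
    return CONVERTERS[signature](data)
```

For example, `apply_signature(["str->symbol", ["str->string", "pair->tuple"]], ["foo", ["bar", [1, 2]]])` converts each leaf with the converter named at the corresponding position. Unlike the Scheme version above, no unquoting is needed, because the leaves are plain names.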

PyGuile – Part 3 – Non-goals

Every project, including software development projects, needs an identity. It needs a definition of its boundaries. It has to be clear about what is inside it and what is outside of it.

Without such a definition, the project would try to be too many things for many people, and as a result, its products would not be really useful for anyone.

A project’s identity makes it easier and faster to make design and trade-off decisions.

Given the above trite introduction, and given that there is a list of goals for the PyGuile project, a list of non-goals is needed as well and here it is.

  1. Theoretical academic purity – attempt to convert every data type from Guile to Python and vice versa, and to support the whole range of values assumed by each data type.
  2. Ability to mix code snippets from both Scheme and Python in the same source code file.
  3. Invocation of machine language libraries (static or DLLs) – for this purpose, there are already existing tools (SWIG and PerlXS).
  4. Framework for making it easy to add support for interoperation with yet another scripting language.

There are also some goals, which are low priority and I do not plan to shed tears if they prove to be impossible to achieve without significant effort:

  1. Thread-safety
  2. Tail recursion support

PyGuile – Part 2 – Design Issues

While working on PyGuile, I identified the following design issues.

  1. The data type trees of Scheme and Python do not have a 1:1 correspondence.
    • Do we want to convert a Scheme list into a Python Tuple or a Python List?
    • How about an alist (association list) – should it be a Python List of 2-tuples or a Python Dict?
    • And in the other direction – do we want to convert a Python string into a Scheme string, symbol or keyword?
  2. API for adding plugins which convert between Guile and Python representations of useful data types (such as file handles, images or Berkeley sockets).
  3. How do we want to pass large data structures – convert them immediately, or employ lazy conversion (convert an element only when it is requested)? If we employ lazy conversion, how do we implement the associated bookkeeping? See more about this below.
  4. How do we deal with the different garbage collection regimes of Guile and Python? In particular, how do we make SCM objects owned by Python objects known to the Guile garbage collector?
  5. How will we support Unicode? Bear in mind that we want to minimize manipulations of long text strings.
  6. How do we allow each scripting language to seamlessly invoke functions in the other scripting language?

The problem of lack of 1:1 correspondence will be dealt with as follows.

A standard conversion convention, which will work for the overwhelming majority of cases, will be employed. Functions, which have special needs, will have their argument conversions specified by means of a suitable tree-structured template.

When passing a data structure (or object) created in language A to language B, the following cases can happen:

  1. Opaque pointer – B only passes it around. A performs all processing and B just holds the pointer for future reference.
  2. B accesses a single element (or small number of elements) in the data structure.
  3. B loops over all elements of the data structure.
  4. B needs arbitrary access to several elements of the data structure (example: image processing).

Those cases can be dealt with as follows:

  • Case 1 can be handled by wrapping a language A pointer by a language B object, which carries opaque data around.
  • Cases 2 and 3 can be dealt with by means of custom data access procedures (such as Python’s __getitem__()). An element will be converted only when it is actually requested. Elements in nested data structures can be dealt with as in case 1.
  • Case 4 can be handled by implementing a mechanism for plugging in and registering custom conversion functions for specific data types.
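The lazy approach for cases 2 and 3 can be sketched in Python as follows. This is a minimal illustration, not PyGuile's implementation: the class name `LazySequence` and its `conversions` counter are hypothetical, and an ordinary Python list stands in for the data owned by language A.

```python
class LazySequence:
    """Wraps a foreign sequence; converts an element only on first access."""

    def __init__(self, foreign, convert):
        self._foreign = foreign   # stands in for data owned by language A
        self._convert = convert   # per-element conversion function
        self._cache = {}          # index -> already converted element
        self.conversions = 0      # bookkeeping, for demonstration only

    def __len__(self):
        return len(self._foreign)

    def __getitem__(self, index):
        # Integer indexes only; slices are outside the scope of this sketch.
        if index < 0:
            index += len(self._foreign)
        if not 0 <= index < len(self._foreign):
            raise IndexError(index)
        if index not in self._cache:
            self._cache[index] = self._convert(self._foreign[index])
            self.conversions += 1
        return self._cache[index]
```

Accessing one element of a thousand-element structure then costs one conversion, and repeated access to the same element costs none. Looping over the wrapper still works, because Python falls back to calling `__getitem__` with 0, 1, 2, ... until `IndexError` is raised.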

In practice, the toughest design issue I have identified so far is the management of the SCM objects owned by Python objects.

When an SCM object is assigned to an attribute of a Python object, some registration mechanism needs to be invoked so that the SCM object can be reclaimed by the Guile garbage collector if the Python object goes out of scope. The registration mechanism also needs to take care of marking the SCM objects while they are owned by a living Python object.
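The bookkeeping such a registration mechanism needs can be sketched in pure Python. Everything here is a simulation under assumptions: `ProtectionRegistry` and `Owner` are hypothetical names, and plain strings stand in for SCM handles. A real implementation might instead call Guile's C functions `scm_gc_protect_object` and `scm_gc_unprotect_object` at the points where this sketch updates its counts.

```python
import weakref


class ProtectionRegistry:
    """Stand-in for the set of SCM objects the Guile GC must treat as live."""

    def __init__(self):
        self._counts = {}  # handle -> number of Python owners

    def protect(self, handle):
        self._counts[handle] = self._counts.get(handle, 0) + 1

    def unprotect(self, handle):
        remaining = self._counts[handle] - 1
        if remaining:
            self._counts[handle] = remaining
        else:
            del self._counts[handle]  # no owners left: Guile may reclaim it

    def is_protected(self, handle):
        return handle in self._counts


class Owner:
    """A Python object that holds a (simulated) SCM handle."""

    def __init__(self, registry, handle):
        self.handle = handle
        registry.protect(handle)
        # Runs at most once: when this owner is garbage collected, or
        # earlier if release() is called explicitly.
        self._finalizer = weakref.finalize(self, registry.unprotect, handle)

    def release(self):
        self._finalizer()
```

The count per handle matters because several Python objects may own the same SCM object; it must stay marked until the last owner is gone.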

PyGuile – Part 1 – Using Python libraries in Guile (a Scheme implementation) scripts

For a long time I have dreamt of invoking Python libraries from scripts written in Scheme. The reason for this is to be able to enjoy the fantastically rich control structures possible in Scheme, yet use familiar libraries to accomplish useful actions, some of which are unavailable in SLIB and other Scheme libraries.

Now at last I am working on realizing this dream. The Scheme implementation being used is version 1.6 of Guile and the Guile extension being developed embeds a Python 2.4 interpreter. In the future, more recent versions of Guile and Python will be used.

The goals of the project are:

  1. Make it easy to invoke Python libraries from Guile.
  2. The integration between Python and Guile is to be seamless.
  3. The architecture of the implementation shall enable optimizations for efficient runtime behavior.

To accomplish those goals, it is necessary to:

  1. Convert primitive Scheme data types (integers, reals, Booleans, strings, lists) into the corresponding Python data types, and vice versa.
  2. Be able to invoke functions defined in one language from the other language. This has to be bidirectional in order to support callbacks.
  3. Be able to pass around pointers to objects (as opaque values) and invoke methods over them.
  4. Have efficient transfer of control and data between both languages.
  5. Deal with different garbage collection conventions in both environments.
  6. Be able to optimize code for a particular pair of language runtime systems.
  7. Nice to have: support for recursion, especially tail recursion.
  8. Nice to have: thread-safety.

It is envisioned that the software developed in this project will be part of a larger system, which will allow more scripting languages to interoperate with Guile and with each other.

There is another project – Schemepy – which embeds a Scheme interpreter in Python scripts.  That project has a different focus: it essentially allows Scheme to be used for those parts of a project in which its strengths are especially important.

Perfect ad network

I saw somewhere the following question and decided to share the results of some of the resulting firings of my neurons:

In your opinion, how would a perfect ad network look like?

This blog and my entire Web site currently work with the Google AdSense ad network. It appears to do a pretty good job, but the following could improve it:

  • Option to forward all ad requests or part of them to the Webmaster for vetting. Sometimes it is obvious to the Webmaster that an ad will not be effective on a particular Web site.
  • Option for the Webmaster to specify a recommended location for an ad in his Website, overriding the ad network’s choice.
  • Better reporting to the Webmaster of clickthroughs on specific ads in specific locations, to help the Webmaster optimize his overriding decisions.
  • Effective mechanism for ad publisher to complain that clickthroughs from a particular Web site yielded no income and should not be compensated for – this will ensure that the Webmasters are kept honest.
  • Effective mechanism for Webmasters to know whether ad publishers abused the mechanism of repudiating clickthroughs as yielding no income – to keep them honest, according to the Webmasters’ point of view.

Abortions, no questions asked

The current law in Israel is that women are not automatically entitled to have abortions.  They must apply for approval by a committee.  Unmarried women and women below or above a certain range of ages get automatic approval.  Married women are allowed to have an abortion only if there is a medical or another legally recognized reason for this.

In the wake of the recent tragedies, in which 4 year old children were murdered by their mothers and/or grandfathers, the politicians are clamoring for something to be done.  A plan was indeed put together to improve monitoring of families having children at those ages.

May I suggest another solution to the problem: allow any pregnant woman, regardless of her marital, health or sociological status, to have an abortion, no questions asked, if she does not feel like having the baby.  Since the goal is that only women, who really want babies, would have them, there should be no stigma attached to having an abortion.  At least not beyond the existing stigma of not wanting to have children.

Abortion is better than killing a child or raising him, like an unwanted child, to be a criminal.

Blog Day 2008

The 4th Blog Day was held yesterday, but I was too busy to notice this, so my contribution was postponed to today.

More information about the yearly Blog Day.

I apologize to my fellow Hebrew language blog writers for not mentioning any Hebrew language blog this time. In my defense, I’ll point out the common denominator of the following blog recommendations. They all deal with various aspects of bullshit. Good and bad stories about disaster recovery (some of the bad cases are accompanied by bullshit), bullshit as “security theater” (which is a nefarious kind of bullshit), IT project failures (again, frequently due to bullshit), amusing software related bullshit stories, bullshit gadget designs, and bullshit in general.

  • Amanda Ripley’s Blog
    Amanda Ripley wrote the book “The Unthinkable: Who Survives When Disaster Strikes – And Why” about behavior of people, who were caught in natural and human-made disasters. Her blog expands upon the theme of disaster recovery.
  • Schneier on Security
    Bruce Schneier is the foremost computer security expert in the world, and is also the author of the book “Applied Cryptography”. His theme is that security is a system property. No technological means assures security if there is a security vulnerability in the rest of the system. He blogs, among other things, about stupid security policies.
    Not related to security, he blogs also about his hobby – squids.
  • IT Project Failures
    Informs the readership about big IT project failures and their causes.
  • Worse Than Failure
    Amusing stories about the foibles of less than top notch computer professionals and software developers.
  • Commonsense Design
    This blog has the slogan: “Nathan Zeldes writes on the Good, the Bad and the Ugly of everyday product design”. I cannot improve upon this slogan.
  • The War On Bullshit
    Opinions, which are sometimes politically incorrect, about various bullshit attitudes in sociology and politics.

How to get rid of gadget chargers and power supplies

The Slashdot question about this topic reminds me of the overwhelming array of 9 chargers and low voltage power supplies which power my equipment.

In the comments, it was mentioned that a connector manufacturer employs a lobbyist to foil any attempt to mandate standardization of the connectors and low voltage power supplies.

Another comment mentioned the 12V standard car cigarette lighters. This standard is currently usable only in cars, and most gadgets are not designed to be powered from them.

Solution?

  1. Manufacture a splitter which allows 5-6 plugs (shaped like car cigarette lighter plugs) to receive power at the same time.  Those splitters are meant for use at homes,  and will allow several gadgets to be powered/charged at the same time.
  2. Manufacture a DC to DC converter for each gadget, to allow most gadgets to be powered from standard car cigarette lighters. It may be possible to miniaturize those converters, as they don’t require a 110V/220V step down transformer.  People will prefer to carry those converters with their gadgets, rather than the bulkier manufacturer-provided power supplies.
  3. Manufacture a car cigarette lighter lookalike socket, powered by a step down transformer, for use at homes.  This will allow homes to provide the same power connectors as cars.

Then, car cigarette lighter sockets will become the de-facto standard power supply for gadgets. Those three products will solve the chicken-and-egg problem of introducing a standard power supply for gadgets, which require DC power.

A possibly systematic flaw in Israeli defense strategy

One of the constants in Israeli history is that Israel wins wars but loses in the post-war diplomatic front, so Israel doesn’t succeed in converting its war victories into everlasting peace with its neighbors.

Why is this so? Is it because the Israeli leaders are so preoccupied with the daily tasks of managing Israel, that they have no time to plan ahead? Is it because no one thought about the future?

About the value of planning ahead, Eliot A. Cohen wrote that two great war statesmen planned ahead and defined their war goals. They knew what kind of peace they wanted to have. One of them (Abraham Lincoln) achieved it, and the opinions of the other (Winston Churchill) stood the test of time.

Two other war statesmen won wars but did not win everlasting peace. One of them was David Ben-Gurion, who failed to define what he wanted to accomplish in the 1948 War of Independence, and what kind of peace to strive toward. One of the consequences is that Israel did not have peace with any of its neighbors until the 1979 Egyptian-Israeli peace treaty.

This pattern, of fighting and winning but without planning ahead the kind of desirable victory, continued in the Arab-Israeli wars since 1948, in spite of journalists having spent lots of ink writing about it and heavily criticizing the leaders for this shortcoming. The only exception, of which I am aware, is the 1982 Lebanon War (now known as the First Lebanon War), whose goals were defined. However, this exception proves the rule, because those goals were not consistently pursued due to political pressure from various leaders and other reasons.

Now I suspect that the consistent failure to define war goals was not an oversight by overwhelmed Israeli leaders, but part of a systematic problem. To define war goals and to get most of the Israelis to agree with them, one needs first to define what kind of Israel one wants and get this vision accepted by the overwhelming majority of the Israelis. If we want to emphasize territory annexation, we need one set of war goals. If we want to emphasize human rights, we need another set of war goals.

The systematic problem is that Israelis cannot agree what kind of Israel they want. There is a conflict between the secular (who want a state of the Jews) and the religious (who want a Jewish state). There is also a conflict between the Settlers (who want to annex as much land as the world will let them) and the Leftists, who care about the human rights of Palestinians living in land currently controlled by Israel.

A consequence of the internal conflicts is that it is impossible for any Israeli leader to define, articulate and consistently pursue any coherent set of war goals. At least if he does not want to commit political suicide (Ariel Sharon in 1982, anyone?) or reap lots of poisonous criticism from people who don’t agree with his vision of Israel and the war goals to be pursued.

Bibi Netanyahu’s Incredibly Simple Basic Approach

Today I decided at last to have a look at Bibi Netanyahu’s blog, whose existence I have known about for a while.
The blog is written in Hebrew.
He considers the problems of the Israeli educational system, which has been deteriorating for several years now.
His suggestion – apply the same basic approach, which he successfully applied when he was Minister of Finance and got the Israeli economy to improve in a big way.
What basic approach?
Find which countries have the most successful policies (then – economics, now – educational). Then learn from their experience.
All the rest are mere details.

This approach is also politically feasible:

  • It is easier to sell a new policy to other politicians and to the constituency if you show that it worked beautifully in country A and country B.
  • In the specific case of the educational system, the solution is to get better people in and less suitable people out. Fortunately, it is not so difficult to do so on a time scale of 10 years, thanks to the high employee turnover in the system.