Monday, April 20, 2009

Understanding Zope acquisition

Disclaimer: I'm not a Zope expert. Whatever I write down here is the result of some experiments, and hence may not capture the exact rules. Feel free to correct me!

These are some notes I wrote for myself when I had to fix a Zope-based site, but then, why not share them... They explain some basics and, most importantly, some not-well-documented (or rather, not-well-explained) aspects of Zope acquisition.

How it works

Acquisition ("aq" from now on) is basically an extension of the Python language. Acquisition means that an object inherits attributes from its "parent" object, which in turn possibly acquires from its own parent, and so on. This is independent of class inheritance, which is about the type (class) hierarchy. Aq isn't omnipresent; it only kicks in when you deal with objects that were specifically made to be aq-aware, or that were obtained from such an object.

So, what's the "parent" object we inherit from? Basically (and somewhat simplified) it's the object whose attribute the current object is. So, in a.b.c, "c" will appear to have a "foo" attribute if any of "a", "b" or "c" has such an attribute. The attribute is first searched for in "c", then, if it's not there, in "b", and if it's not there either, in "a".

But how does the "c" object know what its parent object is? The raw "c" object itself doesn't know that. But if "a" is aq-aware, a.b will return "b" wrapped (encapsulated) into another object, a so-called aq wrapper, and reading "c" from that wrapped "b" will return "c" wrapped too. So in

o = a.b.c
print(o.foo)
# This is actually the same as print(a.b.c.foo),
# but I felt it's easier to get the point this way...

we don't just get the "foo" of the raw "c" object, but of the aq wrapper. The wrapper mostly acts like the wrapped object inside it (so you mostly won't notice that you are working with an impostor), but it modifies the attribute reading semantics to implement aq. We could depict the result of a.b.c (stored in o) as:

    a    <-- result of expression a
    ^  
    |  
{b, P}   <-- result of expression a.b
     ^ 
     | 
 {c, P}  <-- result of expression a.b.c

Legend: {} symbolizes a wrapper, and inside it the first item is the wrapped
        object, and P is the pointer to the aq parent. Lower-case letters
        are the raw (not wrapped) objects.

So, as you can see, the objects were wired together behind the scenes. The value of o knows how it was obtained (a -> b -> c). Note that the aq hierarchy corresponds to the way we accessed "c", so the same object may belong to different aq hierarchies depending on how we accessed it. In other words, the same object may behave differently (i.e., have different attributes) depending on the path through which we reach it. For example, if the same "c" object could also be reached as u.v.w.c, the resulting object would acquire from "u", "v" and "w" (assuming at least "u" was aq-aware), yet it's the same "c" (kind of...).
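
The "same object, different path, different attributes" idea can be tried out without Zope. Below is a toy sketch that ignores wrappers entirely and just searches the access path backwards. It only holds for the simple case where each object is a direct attribute of the previous one; the Raw class and the acquire function are made up for this illustration and are not part of the Acquisition package.

```python
class Raw(object):
    """Plain object; stands in for the raw (not wrapped) objects."""
    pass


def acquire(path, name):
    """Searches name along the access path, from the last object backwards.

    path lists the raw objects in the order they were accessed,
    like [a, b, c] for the expression a.b.c.
    """
    for obj in reversed(path):
        if hasattr(obj, name):
            return getattr(obj, name)
    raise AttributeError(name)


a, b, c = Raw(), Raw(), Raw()
u, v, w = Raw(), Raw(), Raw()
a.foo = "a's foo"
u.foo = "u's foo"
w.bar = "w's bar"

# The very same "c", reached through two different paths:
print(acquire([a, b, c], 'foo'))     # a's foo
print(acquire([u, v, w, c], 'foo'))  # u's foo
print(acquire([u, v, w, c], 'bar'))  # w's bar; a.b.c would have no "bar"
```
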

Now, the tricky part... Let's say "a" has non-acquired (i.e., plain, direct) "x" and "foo" attributes, and that "b" has a non-acquired "foo" attribute too, but no "x" attribute. What does a.b.x.foo return, the "foo" of "a" or of "b"? (Note that "x" is acquired from "a"... a bit odd for those who haven't worked with aq yet.) It will be the "foo" of "a". This rule is often referred to as "containment before context". In this case, "x" is contained in "a" ("a" is the parent), hence according to the containment hierarchy we could only acquire it from "a" (the only parent in this case), and thus we first search for "foo" in "x" and then in "a", and there we find it. But if "a" had no "foo" attribute, then we would continue searching in the "context" too, according to which the parent of "x" is "b", and hence we would find "foo" in "b".

But how is this implemented? How is "container" vs. "context" stored? Well, an aq wrapper points only to a single parent object (with its aq_parent field); there are no separate container parent and context parent fields or the like. But, as the wrapped object itself is possibly an aq wrapper, we can still have two parents for some nodes in the aq hierarchy:

        a <-.   <-- result of expression a
        ^   |
        |   |
    {b, P}  |   <-- result of expression a.b
         ^  |
         |  |
{{x, P}, P} |   <-- result of expression a.b.x; It contains two parents!
     |      |
     \______/

{x, P} was created when "a" returned "x" during the evaluation of a.b.x. So internally an a.x expression was evaluated here, whose result is of course {x, P}, with P pointing to "a". So, we have the two P-s, but what about the priorities? As you know by now, when an attribute is searched for, the aq wrapper (the {...} things above) first searches in the wrapped object (the first item inside the {...}), and only then in the parent (the second item inside the {...}); that's why real attributes have priority over acquired ones. In this case, the wrapped object itself is an aq wrapper, {x, P}. So, we will search for "foo" in {x, P} first, which in turn will search for it in "x" first, then in its parent, "a". Only after that would it search for "foo" in the 2nd P of {{x, P}, P}, which points to {b, P} (i.e., to "b"), but in this case it won't, as we have already found the attribute in "a". So this is why the true container ("a") had priority over the context ("b").

Let's see some Python code to build the above hierarchy:

from Acquisition import Implicit

class Dummy(Implicit):  # extending Implicit turns on aq
   pass

a = Dummy()
b = Dummy()
x = Dummy()
a.b = b
a.x = x
a.foo = "a's foo"
b.foo = "b's foo"

print(a.b.x.foo)  # says "a's foo"

a.b.x above evaluates to the wrapper graph depicted earlier. Note that the top-level variables "a", "b" and "x" will remain raw objects, not aq wrappers; wrappers appear only in the results of attribute readings. But also note that although the raw objects don't do any aq, they are still not just plain old Python objects, but aq-aware ones, as they return aq wrappers for attribute readings.
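
To make the lookup order described above more tangible, here's a toy, pure-Python model of it. This is surely not how the Acquisition package itself is implemented, and it differs from it in many details; Node, Wrapper and wrap are names made up for this sketch. It only mimics the two rules explained above: aq-aware objects wrap what they return, and a wrapper searches the wrapped object first and its parent only after that.

```python
class Wrapper(object):
    """Toy aq wrapper; this is the {obj, P} thing of the diagrams."""

    def __init__(self, obj, parent):
        self.aq_self = obj       # the wrapped object (may itself be a Wrapper)
        self.aq_parent = parent  # the P pointer

    def __getattr__(self, name):
        # Python calls this only if the wrapper itself has no such attribute.
        try:
            value = getattr(self.aq_self, name)    # 1st: the wrapped object
        except AttributeError:
            value = getattr(self.aq_parent, name)  # 2nd: the aq parent
        return wrap(value, self)


class Node(object):
    """Toy stand-in for an aq-aware class (like one extending Implicit)."""

    def __getattribute__(self, name):
        # Raw objects don't acquire anything, but they wrap what they return.
        return wrap(object.__getattribute__(self, name), self)


def wrap(value, parent):
    """Wraps aq-aware values; plain values (like strings) are left alone."""
    if isinstance(value, (Node, Wrapper)):
        return Wrapper(value, parent)
    return value


a = Node()
b = Node()
x = Node()
a.b = b
a.x = x
a.foo = "a's foo"
b.foo = "b's foo"

via_container = a.b.x.foo  # found through {x, P}, whose P points to "a"
del a.foo
via_context = a.b.x.foo    # now only reachable through the outer P

print(via_container)  # a's foo
print(via_context)    # b's foo
```

After deleting the "foo" of "a", the search falls through the inner {x, P} wrapper and continues with the outer parent, which is the "context" part of "containment before context".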

The special attributes of aq wrappers ({...})

All attributes of the wrapped objects are available, plus some extra ones, and here I will list some of those. (The examples are based on the earlier example.)

  • aq_parent: Points to the wrapped parent object, unless the parent object is the "root", in which case it's not wrapped:

    {b, P} --> a
    {{x, P}, P} --> {b, P}
    
  • aq_self: The wrapped object, which may itself be a wrapper:

    {b, P} --> b
    {{x, P}, P} --> {x, P}
    
  • aq_base: The innermost wrapped object without the wrapping:

    {b, P} --> b
    {{x, P}, P} --> x
    
  • aq_inner: The innermost wrapped object together with its innermost wrapper (i.e., all but the innermost wrapper is removed):

    {b, P} --> {b, P}
    {{x, P}, P} --> {x, P}
    
  • aq_chain: The list of the object and its (wrapped) parents, starting with the object itself:

    {{x, P}, P} --> [{{x, P}, P}, {b, P}, a]
    
  • __of__(self, parent): Creates an aq wrapper around self, and returns it:

        a
        ^
        |
    {b, P}   <-- this is the result of b.__of__(a) 
    Note that "b" is possibly not even an attribute of "a", yet aq will treat "a" as the container of "b" when we use the result of this __of__ method call.
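
The attributes listed above can be traced by hand on a toy model. The sketch below doesn't use the Acquisition package at all: Wrapper is a made-up class that only stores the two pointers, plain strings stand in for the raw objects, and aq_base, aq_inner and aq_chain are written here as ordinary functions that follow the descriptions above (assuming those descriptions are right).

```python
class Wrapper(object):
    """Toy {obj, P} wrapper; it only stores the two pointers."""

    def __init__(self, obj, parent):
        self.aq_self = obj       # the wrapped object
        self.aq_parent = parent  # the P pointer


def aq_base(obj):
    """Removes all the wrappers."""
    while isinstance(obj, Wrapper):
        obj = obj.aq_self
    return obj


def aq_inner(obj):
    """Removes all but the innermost wrapper."""
    while isinstance(obj, Wrapper) and isinstance(obj.aq_self, Wrapper):
        obj = obj.aq_self
    return obj


def aq_chain(obj):
    """Lists obj and its parents, following the outermost P each time."""
    chain = [obj]
    while isinstance(obj, Wrapper):
        obj = obj.aq_parent
        chain.append(obj)
    return chain


a, b, x = "a", "b", "x"    # raw objects; plain strings are enough here

a_b = Wrapper(b, a)        # {b, P}, like the result of a.b (or b.__of__(a))
a_x = Wrapper(x, a)        # {x, P}, like the result of a.x
a_b_x = Wrapper(a_x, a_b)  # {{x, P}, P}, like the result of a.b.x

print(a_b_x.aq_parent is a_b)               # True
print(a_b_x.aq_self is a_x)                 # True
print(aq_base(a_b_x) is x)                  # True
print(aq_inner(a_b_x) is a_x)               # True
print(aq_chain(a_b_x) == [a_b_x, a_b, a])   # True
```
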

Roman numbers in Java

Surely there are several implementations of this around, but... someone may still want to copy-paste this. It only works up to 3999, as beyond that non-US-ASCII characters are needed. It's trivial to extend it for that, however... Public Domain:

private static final char[] UPPER_ROMAN_DIGITS = new char[] {
    'I', 'V',
    'X', 'L',
    'C', 'D',
    'M'
}; 

private static final char[] LOWER_ROMAN_DIGITS = new char[] {
    'i', 'v',
    'x', 'l',
    'c', 'd',
    'm'
}; 

/**
 * Converts a number to upper-case Roman number, like XVI; up to 3999.
 * @throws IllegalArgumentException if the number is not in the [1..3999]
 *    range.
 */
public static String toUpperRomanNumber(int n) {
    return toRomanNumber(n, UPPER_ROMAN_DIGITS);
}

/**
 * Converts a number to lower-case Roman number, like xvi; up to 3999.
 * @throws IllegalArgumentException if the number is not in the [1..3999]
 *    range.
 */
public static String toLowerRomanNumber(int n) {
    return toRomanNumber(n, LOWER_ROMAN_DIGITS);
}

private static String toRomanNumber(int n, char[] romanDigits) {
    // We fetch the decimal digits from right to left.
    // The res buffer will contain the Roman number *backwards*, and thus it
    // also will contain the Roman "digits" backwards, like 7 will be "IIV".
    // At the very end the buffer is reversed.
    
    if (n < 1 || n > 3999) {
        throw new IllegalArgumentException("toRomanNumber only supports "
                + "numbers in the [1..3999] range, but the number was "
                + n + ".");
    }
    
    StringBuilder res = new StringBuilder();
    int base = 0;
    while (n != 0) {
        int digit = n % 10;
        n /= 10;
        if (digit != 0) {
            switch (digit) {
            case 3:
                res.append(romanDigits[base]);
                // falls through
            case 2:
                res.append(romanDigits[base]);
                // falls through
            case 1:
                res.append(romanDigits[base]);
                break;
                
            case 4:
                res.append(romanDigits[base + 1])
                        .append(romanDigits[base]);
                break;
                
            case 8:
                res.append(romanDigits[base]);
                // falls through
            case 7:
                res.append(romanDigits[base]);
                // falls through
            case 6:
                res.append(romanDigits[base]);
                // falls through
            case 5:
                res.append(romanDigits[base + 1]);
                break;
                
            case 9: 
                res.append(romanDigits[base + 2]);
                res.append(romanDigits[base]);
                break;
                
            default:
                // Can't happen, as digit is always in the 1..9 range here
                throw new AssertionError("Unexpected branch");
            }
        }
        base += 2;
    }
    return res.reverse().toString();
}

Latin numbers a.k.a. alpha numbers a.k.a. alphabetical Excel column naming/numbering in Java

There is that kind of odd "number system" or number formatting that Excel uses for the columns. I believe this is also used for "numbering" appendices in books. At first glance you will think it's just a base-26 number system, but it's not, as this system has no 0 digit, which means you will fail miserably with the good old division-and-modulo-with-the-base trick. Anyway, here they are: methods for converting an int to a Latin number, without upper limit. Watch out, in this implementation A is 1, not 0. Public Domain:

/**
 * Converts a number to upper-case Latin (alpha) number, like
 * A, B, C, and so on, then Z, AA, AB, etc. No upper limit.
 */
public static String toUpperLatinNumber(int n) {
    return toLatinNumber(n, 'A');
}

/**
 * Converts a number to lower-case Latin (alpha) number, like
 * a, b, c, and so on, then z, aa, ab, etc. No upper limit.
 */
public static String toLowerLatinNumber(int n) {
    return toLatinNumber(n, 'a');
}

private static String toLatinNumber(final int n, char oneDigit) {
    if (n < 1) {
        throw new IllegalArgumentException("Can't convert 0 or negative "
                + "numbers to latin-number: " + n);
    }
    
    // First find out how many "digits" will we need. We start from A, then
    // try AA, then AAA, etc. (Note that the smallest digit is "A", which is
    // 1, not 0. Hence this isn't like a usual 26-based number-system):
    int reached = 1;
    int weight = 1;
    while (true) {
        // Calculated as long, so that these can't overflow for big n-s
        long nextWeight = weight * 26L;
        long nextReached = reached + nextWeight;
        if (nextReached <= n) {
            // So we will have one more digit
            weight = (int) nextWeight;
            reached = (int) nextReached;
        } else {
            // No more digits
            break;
        }
    }
    
    // Increase the digits of the place values until we get as close
    // to n as possible (but don't step over it).
    StringBuilder sb = new StringBuilder();
    while (weight != 0) {
        // digitIncrease: how many we increase the digit which is already 1
        final int digitIncrease = (n - reached) / weight;
        sb.append((char) (oneDigit + digitIncrease));
        reached += digitIncrease * weight;
        
        weight /= 26;
    }
    
    return sb.toString();
}