String Comparison Technique Used by Python


I'm wondering how Python does string comparison, more specifically how it determines the outcome when a less than (<) or greater than (>) sign is used.

For instance, if I put print('abc' < 'bac') I get True. I understand that it compares corresponding characters in the strings, but it's unclear to me why there is more, for lack of a better term, "weight" placed on the fact that a is less than b (first position) in the first string rather than the fact that a is less than b in the second string (second position).

From the docs:

The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.

Also:

Lexicographical ordering for strings uses the Unicode code point number to order individual characters.

Or on Python 2:

Lexicographical ordering for strings uses the ASCII ordering for individual characters.

As an example:

>>> 'abc' > 'bac'
False
>>> ord('a'), ord('b')
(97, 98)

The result False is returned as soon as a is found to be less than b. The further items are not compared (as you can see for the second items: b > a is True).
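The rule quoted from the docs can be sketched in plain Python. This is only an illustration of the idea, not how CPython implements it internally, and the function name lexicographic_less_than is a made-up helper for this answer:

```python
def lexicographic_less_than(left, right):
    """Hypothetical helper that mimics `left < right` for two strings."""
    for a, b in zip(left, right):
        if a != b:
            # The first position where the strings differ decides the result;
            # nothing after it is looked at.
            return ord(a) < ord(b)
    # No difference found in the overlapping part: the strings are equal,
    # or one is a prefix of the other, and the shorter one is smaller.
    return len(left) < len(right)


print(lexicographic_less_than('abc', 'bac'))   # True, decided by 'a' < 'b'
print(lexicographic_less_than('abc', 'abcd'))  # True, 'abc' is a prefix
```

This is why the first position carries all the "weight": the loop returns at the first mismatch and never reaches the second characters.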

Be aware of lower and uppercase:

>>> import string
>>> abc = string.ascii_lowercase
>>> [(x, ord(x)) for x in abc]
[('a', 97), ('b', 98), ('c', 99), ('d', 100), ('e', 101), ('f', 102), ('g', 103), ('h', 104), ('i', 105), ('j', 106), ('k', 107), ('l', 108), ('m', 109), ('n', 110), ('o', 111), ('p', 112), ('q', 113), ('r', 114), ('s', 115), ('t', 116), ('u', 117), ('v', 118), ('w', 119), ('x', 120), ('y', 121), ('z', 122)]
>>> [(x, ord(x)) for x in abc.upper()]
[('A', 65), ('B', 66), ('C', 67), ('D', 68), ('E', 69), ('F', 70), ('G', 71), ('H', 72), ('I', 73), ('J', 74), ('K', 75), ('L', 76), ('M', 77), ('N', 78), ('O', 79), ('P', 80), ('Q', 81), ('R', 82), ('S', 83), ('T', 84), ('U', 85), ('V', 86), ('W', 87), ('X', 88), ('Y', 89), ('Z', 90)]
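Because every uppercase ASCII letter has a smaller code point than every lowercase one, mixed-case strings can sort in a surprising order. If a case-insensitive comparison is what you want, one common approach (an assumption about your use case, not something the question asked for) is to normalize both sides with str.casefold() first. The sample words here are just placeholders:

```python
# Plain comparison works on code points: 'Z' (90) comes before 'a' (97).
print('Zebra' < 'apple')                        # True

# Normalizing case first compares 'zebra' with 'apple' instead.
print('Zebra'.casefold() < 'apple'.casefold())  # False

words = ['banana', 'Cherry', 'apple']
default_order = sorted(words)
caseless_order = sorted(words, key=str.casefold)

print(default_order)   # ['Cherry', 'apple', 'banana']
print(caseless_order)  # ['apple', 'banana', 'Cherry']
```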
