Python Course
Session 6a – String Formatting
Good string formatting capabilities are very useful, particularly when printing values. That means powerful, easy-to-use features for parameterising strings and for formatting the substituted values in conventional ways, e.g. financial values in a spreadsheet.
As of Python 3.6, there are three built-in string formatting mechanisms:
•The printf style (the % operator).
•The format method.
•Formatted string literals (f-strings), a nicer way of using the format method.
The printf style is very similar to that of the C language. The format method is an improvement but a little more verbose. F-strings, new in Python 3.6, are the most concise of the three, and are easily understood by those already familiar with the format method.
In all cases, string formatting consists of embedding format specifiers within a string, where the format specifier allows you to parameterise part of that string. E.g. the following string has some fixed text, and some variable text (myname and myage).
“My name is myname, and I am myage years old.”
The values to be substituted are passed to a formatting function along with the string to be formatted.
Since programming languages don't provide niceties such as italics, there has to be some convention whereby a format specifier can be embedded in a string and recognised as a format specifier. At the same time, it is useful to enhance the format specifier to provide useful mechanisms for formatting (“converting”) the inserted variable values. The mechanism chosen to achieve this differs in each case.
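As a rough illustration, the sentence above can be written with each of the three mechanisms. The values "Alice" and 30 are made up for the example; only the names myname and myage come from the text.

```python
# Hypothetical sample values for the two variable parts of the sentence.
myname = "Alice"
myage = 30

# 1. printf style: %s and %d mark the variable parts.
printf_result = "My name is %s, and I am %d years old." % (myname, myage)

# 2. format method: each {} is a replacement field.
format_result = "My name is {}, and I am {} years old.".format(myname, myage)

# 3. f-string (Python 3.6+): the expression sits inside the braces.
fstring_result = f"My name is {myname}, and I am {myage} years old."

print(printf_result)
print(format_result)
print(fstring_result)
```

All three produce the same text; only the convention for marking the variable parts differs.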
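The printf style is the oldest of the three mechanisms. It is sometimes described as semi-deprecated, although it remains fully supported and it's unclear whether it will ever be removed from Python. The general format specifier is a multi-character sequence: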
%(K)FW.PLT where K, F, W, P, L and T are parts of the format specifier.
% | Required | Introduces the start of the format specifier. |
(K) | Optional | Provides a “mapping key” K. Used when the values to be formatted are supplied in a dictionary. E.g. (myname) |
F | Optional | Conversion flags. Characters from the set: # 0 - (space) + |
W | Optional | Minimum field width, or * to indicate the field width is supplied as one of the values. E.g. 10 |
.P | Optional | Precision, or * to indicate the precision is supplied as one of the values. E.g. 2 |
L | Optional | Length modifier (ignored!). |
T | Required | Marks the end of the format specifier and provides the conversion type, one of: d i o u x X e E f F g G c r s a % |
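The parts of the specifier described in the table can be tried out with a few short expressions. This is only a sketch; the dictionary and the numbers are invented sample data.

```python
# Mapping key: the values come from a dictionary (hypothetical sample data).
person = {"myname": "Alice", "myage": 30}
by_key = "%(myname)s is %(myage)d" % person   # 'Alice is 30'

# Flags, width and precision.
left = "[%-8d]" % 42          # '-' left-justifies:  '[42      ]'
zero = "[%08.3f]" % 3.14159   # '0' zero-pads:       '[0003.142]'
sign = "[%+d]" % 42           # '+' forces a sign:   '[+42]'

# '*' takes the width and the precision from the tuple of values.
star = "[%*.*f]" % (8, 2, 3.14159)   # '[    3.14]'

for result in (by_key, left, zero, sign, star):
    print(result)
```

Note that when more than one value is supplied, the values must be passed as a tuple (or as a dictionary when mapping keys are used).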
The % character introduces a format specifier, so a single % can't stand for a literal percent character in a format string. To overcome this problem, % is also a conversion type: %% produces a single percent character in the result. This is separate from the backslash \, which is Python's general escape character in string literals (e.g. \\ represents a backslash character) and is processed before any formatting takes place; \% is not a recognised escape sequence.
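A quick way to check how a literal percent sign behaves in a printf-style format string is to run a small example such as the one below. The variable names and the value 7.5 are made up for illustration.

```python
rate = 7.5  # hypothetical sample value

# Inside a format string, %% yields one literal percent character.
with_percent = "Growth: %.1f%%" % rate   # 'Growth: 7.5%'

# A string that is never used with the % operator needs no doubling.
plain = "100% sure"

print(with_percent)
print(plain)
```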
Python's basic usage of the printf format is quite similar to that of C:
C | printf("STR%10.2fING", 012.345); | STR     12.35ING |
Python | print("STR%10.2fING" % 012.345) | STR     12.35ING |
The two are almost identical, except that Python uses the % operator, which eliminates the need for an explicit formatting function such as printf.
Examples:
"a:%d b:%.2f c:0x%X" % (44, 5.123, 127) | 'a:44 b:5.12 c:0x7F' |
The format method was the de facto Python string formatter, at least until Python 3.6 brought us f-strings. Here a format string contains structures called replacement fields, each of which can contain a format specifier. The general form of a replacement field is:
{N!C:F}
N | Optional | Field Name. |
!C | Optional | Conversion, one of: r s a |
:F | Optional | Format specifier. |
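The field name and conversion parts can be sketched as follows. The sample strings and the keyword names (name, age, point) are chosen only for illustration.

```python
# Field names: explicit positions, automatic numbering, or keywords.
positional = "{0} and {1}, then {0} again".format("a", "b")   # 'a and b, then a again'
automatic = "{} and {}".format("a", "b")                      # 'a and b'
keyword = "{name} is {age}".format(name="Alice", age=30)      # 'Alice is 30'

# A field name may also index into the argument it refers to.
indexed = "{0[1]} {point[x]}".format(["p", "q"], point={"x": 5})   # 'q 5'

# Conversions: !s applies str(), !r applies repr(), !a applies ascii().
text = "caf\u00e9"
converted = "{0!s} {0!r} {0!a}".format(text)

for result in (positional, automatic, keyword, indexed, converted):
    print(result)
```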
The format specifier has similarities to the printf style %(K)FW.PLT but with significant enhancements. The general format specifier is a multi-character sequence:
FAS#0WG.PT where F, A, S, #, 0, W, G, P and T are parts of the format specifier.
F | Optional | Fill character. (Only allowed if followed by an align character.) |
A | Optional | Align character. One of: < > = ^ |
S | Optional | Sign. One of: + - (space) |
# | Optional | Alternate form |
0 | Optional | Sign-aware zero-padding |
W | Optional | Minimum field width. E.g. 10 |
G | Optional | Grouping. One of: , _ |
.P | Optional | Precision. E.g. 2 |
T | Optional | The conversion type. One of: b c d e E f F g G n o s x X % |
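Each part of the format specifier can be tried in isolation. The following is a small sketch using made-up values, not an exhaustive tour.

```python
# Fill and align: '*' is the fill character; <, > and ^ align left, right and centre.
fill_align = "{:*<8}|{:*>8}|{:*^8}".format("ab", "ab", "ab")   # 'ab******|******ab|***ab***'

# Sign: '+' always shows a sign, ' ' leaves a space for positive numbers.
sign = "{:+d} {: d} {:-d}".format(5, 5, -5)      # '+5  5 -5'

# Alternate form: adds the 0x, 0b or 0o prefix.
alternate = "{:#x} {:#b} {:#o}".format(255, 5, 8)   # '0xff 0b101 0o10'

# Sign-aware zero-padding, with width 8 and precision 2.
zero_padded = "{:08.2f}".format(-3.14159)        # '-0003.14'

# Grouping: ',' or '_' as the thousands separator ('_' needs Python 3.6+).
grouped = "{:,} {:_}".format(1234567, 1234567)   # '1,234,567 1_234_567'

for result in (fill_align, sign, alternate, zero_padded, grouped):
    print(result)
```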
Note: To treat { non-specially, you double it: {{. The same applies to }, which is written }}. There are no backslash escapes defined for the characters making up replacement fields.
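A brief sketch of brace doubling, with invented sample strings:

```python
# Doubled braces come out as single literal braces.
literal = "{{literal}} {0}".format(1)   # '{literal} 1'

# A replacement field wrapped in literal braces needs three braces on each side.
wrapped = "{{{0}}}".format("x")         # '{x}'

print(literal)
print(wrapped)
```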
Examples:
"STR{0:10.2f}ING".format(012.345) | 'STR     12.35ING' |
"STR{jim:x^14,.2f} {0}ING".format(99.99, 1, jim=89012.3450) | 'STRxx89,012.35xxx 99.99ING' |
The syntax of the FAS#0WG.PT format specifier can be formally described using Python's modified BNF (Backus-Naur Form) notation:
format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill ::= <any character>
align ::= "<" | ">" | "=" | "^"
sign ::= "+" | "-" | " "
width ::= integer
grouping_option ::= "_" | ","
precision ::= integer
type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
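A format specifier is just a string, so it can be held in a variable and applied with the built-in format function, or assembled from nested replacement fields. The sketch below assumes nothing beyond the standard library; the variable names are illustrative.

```python
# The built-in format() applies a specifier to a single value.
spec = "x^14,.2f"
centred = format(89012.345, spec)    # 'xx89,012.35xxx'

# Nested replacement fields supply parts of the specifier at run time.
dynamic = "{:{width}.{prec}f}".format(12.345, width=10, prec=2)   # '     12.35'

# A few of the conversion types from the grammar.
binary = "{:b}".format(10)             # '1010'
exponent = "{:e}".format(12345.678)    # '1.234568e+04'
character = "{:c}".format(65)          # 'A'

for result in (centred, dynamic, binary, exponent, character):
    print(result)
```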
F-strings are new in Python 3.6. A formatted string literal or f-string is in essence an expression, evaluated at run time. It is very similar to a format string, except that instead of a field name you have an expression, and you don't explicitly invoke the format method. The general form of the replacement field is:
{E!C:F}
Where E is an expression.
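Because E can be any expression, an f-string can call functions, do arithmetic, and even take parts of its format specifier from other variables. A minimal sketch with made-up names and values:

```python
price = 12.345          # hypothetical sample values
items = ["a", "b", "c"]
name = "Bob"
width = 10

doubled = f"{price * 2:.1f}"       # arithmetic inside the field: '24.7'
counted = f"{len(items)} items"    # function call: '3 items'
quoted = f"{name!r:>8}"            # conversion, then alignment: "   'Bob'"
nested = f"{price:{width}.2f}"     # width taken from a variable: '     12.35'

for result in (doubled, counted, quoted, nested):
    print(result)
```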
Examples:
f"STR{012.345:10.2f}ING" | 'STR     12.35ING' |
jim = 89012.3450; f"STR{jim:x^14,.2f} {99.99}ING" | 'STRxx89,012.35xxx 99.99ING' |