python prefix for binary
0b
python prefix for octal
0o
python prefix for hexadecimal
0x
converting an int into a string representation of one of the other base
bin(25)
oct(25)
hex(25)
ord()
code point value for any character
memory and storage size representation
reported in bytes (octet) NB
network speed representation
reported in bits (binary) b
kilo (‘k’)
10^3 = 2^10
mega (‘M’)
10^6 = 2^20
giga (‘G’)
10^9 = 2^30
tera (‘T’)
10^12 = 2^40
peta (‘P’)
10^15 = 2^50
ASCII
method for popular non-unicode character encoding
128 character based on 7-bit encoding
compact, but can only encode a small number of characters
UTF-32
method to represent each unicode code point value directly
32-bit encoding
can encode all unicode characters but documents are often bigger than they need to be
single-byte encoding
ISO-8859
built on top of ASCII to include extra 128 characters
can cover many orthographies but not big ones (large alphabet) such as Chinese and also can’t encode documents with more than one language
variable-width encoding
UTF-8
UTF-16
encode the code points using a variable number of ‘code units; of a fixed size
UTF-8 code units = 8 bits = 1 byte
UTF-16 code units = 6 bits = 2 bytes
UTF-8
variable-width encoding
compatible/superset with ASCII
character boundaries are easily locatable
UTF-16