Bytes, Encoding, And Compression
Bytes
Import bytes:
- fromString(text), toString(bytes)
- fromList(list<int>) - builds bytes from a list of byte values (0-255); rejects out-of-range values
- fromHex(text), toHex(bytes)
- fromBase64(text), toBase64(bytes)
- fromBase64Url(text), toBase64Url(bytes) - unpadded URL-safe base64 (RFC 4648 section 5; the variant JWT/JOSE uses). The decoder accepts both padded and unpadded input.
- concat(left, right) - concatenates two bytes values; call repeatedly or use list.reduce to join many buffers
Bytes values also expose b.toHex(), b.toBase64(), and
b.toBase64Url() as methods, equivalent to the module helpers.
b.toList() returns the byte values as a list<int> (the inverse of
bytes.fromList); b.get(i) / b[i] reads a single byte value.
b.contains(n) tests for a single BYTE VALUE (an int 0-255), not a
sub-sequence: bytes.fromString("hi").contains(104) is true
because 0x68 (h) appears. To search for a byte sub-sequence,
convert with toString and use string contains, or compare
slices.
import bytes;
let data = bytes.fromString("hello");
io.println(bytes.toHex(data));
io.println(bytes.toString(data));
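The difference between a byte-value test and a sub-sequence search, and the range check described for fromList, can be illustrated with Python's built-in bytes type. This is only an analogy, assuming the semantics match the descriptions above; the helper names are hypothetical and are not part of the bytes module.

```python
# Analogy in Python; from_list, contains_value and contains_sequence are
# hypothetical helpers, not functions of the bytes module described above.

def from_list(values):
    """Build bytes from ints, rejecting anything outside 0-255."""
    for v in values:
        if not 0 <= v <= 255:
            raise ValueError(f"byte value out of range: {v}")
    return bytes(values)

def contains_value(data, n):
    """Single byte-value test, the behaviour described for b.contains(n)."""
    return n in data  # an int operand checks individual byte values

def contains_sequence(data, needle):
    """Sub-sequence search, which a byte-value test does not perform."""
    return data.find(needle) != -1

data = from_list([104, 105])            # the bytes of "hi"
print(contains_value(data, 104))        # True: 0x68 ('h') is present
print(contains_sequence(data, b"hi"))   # True
print(list(data))                       # [104, 105]
```

Keeping the two checks as separate functions makes it explicit at the call site which question is being asked.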
Encoding
Import encoding for transport encodings and HTML/URL escaping. The guiding
rule: encoding.* is text-oriented (it works with strings), while the
bytes.* methods are binary-oriented (they work with bytes).
Base64:
- base64Encode(value), base64Decode(text) - standard Base64. Encode accepts a string or bytes; decode returns a string.
- base64UrlEncode(value), base64UrlDecode(text) - unpadded URL-safe Base64 (RFC 4648 section 5; matches JWT/JOSE). Encode accepts a string or bytes; decode returns a string and accepts padded or unpadded input.
import encoding;
let token = encoding.base64UrlEncode("user:42"); # string -> string
let back = encoding.base64UrlDecode(token); # back == "user:42"
For binary payloads, decode through the bytes module so the result stays
binary: bytes.fromBase64(s) and bytes.fromBase64Url(s) (the inverses of
b.toBase64() / b.toBase64Url()).
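What "unpadded URL-safe Base64" and "accepts padded or unpadded input" mean in practice can be sketched with Python's standard base64 module. The function names b64url_encode and b64url_decode are hypothetical; this shows the RFC 4648 section 5 behaviour, not how the encoding or bytes modules are implemented.

```python
import base64

def b64url_encode(data: bytes) -> str:
    # URL-safe alphabet ('-' and '_'), with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    # Restore whatever padding is missing, so both forms decode.
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

token = b64url_encode(b"user:42")
print(token)                         # dXNlcjo0Mg
print(b64url_decode(token))          # b'user:42'
print(b64url_decode(token + "=="))   # b'user:42'
print(b64url_encode(b"\xfb\xff"))    # -_8
```

The last line is a binary payload whose standard Base64 form would contain + and /, which is why it should stay in bytes rather than pass through a string decoder.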
Other base encodings (binary-oriented - decoders return bytes):
- base32Encode(value), base32Decode(text) - RFC 4648 standard alphabet, accepts padded or unpadded input on decode.
- base58Encode(value), base58Decode(text) - Bitcoin/IPFS alphabet (no 0, O, I, l); preserves leading zero bytes by emitting leading 1s.
base32Encode / base58Encode (like the base64 encoders) accept either a
string or bytes, so they compose with random material from
secrets.randomBytes:
import secrets;
let totpSecret = encoding.base32Encode(secrets.randomBytes(20));
let walletId = encoding.base58Encode(secrets.randomBytes(16));
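The leading-zero rule for Base58 is easier to see in code. Below is a minimal Python sketch of the Bitcoin-alphabet scheme, assuming the usual big-integer formulation; the function names are hypothetical and it is not the encoding module's implementation.

```python
# Bitcoin/IPFS Base58 alphabet: no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58_encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    # Leading zero bytes vanish in the integer, so emit one '1' for each.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + "".join(reversed(digits))

def base58_decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # raises ValueError on a bad character
    zeros = len(text) - len(text.lstrip("1"))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + body

print(base58_encode(b"\x00\x00\x01"))   # 112
print(base58_decode("112"))             # b'\x00\x00\x01'
```

Without the explicit count of leading zeros, b"\x00\x00\x01" and b"\x01" would encode to the same text and the round trip would lose data.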
URL and HTML escaping (all string -> string):
- urlEncode(text), urlDecode(text) - percent-encoding for query components.
- htmlEscape(text), htmlUnescape(text) - escape/unescape HTML entities. Use htmlEscape to make untrusted text safe to drop into HTML: it turns <b> into &lt;b&gt; so nothing renders as markup.
- sanitizeHtml(html) - when you must render untrusted HTML (a rich-text comment, say) rather than show it as text, this strips dangerous content against a safe allow-list (keeps common formatting tags like <b>, <a>, <p>; removes <script> / <style>, on* event handlers, and unsafe URL schemes). Escaping neutralizes all markup; sanitizing keeps a safe subset.
encoding.htmlEscape("<b>hi</b>"); # "&lt;b&gt;hi&lt;/b&gt;"
encoding.sanitizeHtml("<b>hi</b><script>x()</script>"); # "<b>hi</b>"
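For comparison, the same escaping round trips can be sketched with Python's standard library. This is an analogy only: the exact set of characters that encoding.htmlEscape and encoding.urlEncode escape (for example, whether a space becomes %20 or +) is not stated above, so the outputs below are Python's, not this module's.

```python
import html
from urllib.parse import quote, unquote

untrusted = "<b>hi</b>"
escaped = html.escape(untrusted)
print(escaped)                  # &lt;b&gt;hi&lt;/b&gt;
print(html.unescape(escaped))   # <b>hi</b>

# Percent-encode a query component; safe="" also escapes '/'.
component = quote("a b&c=d", safe="")
print(component)                # a%20b%26c%3Dd
print(unquote(component))       # a b&c=d
```

Python's standard library has no counterpart to sanitizeHtml, so allow-list sanitizing is not shown here.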
Use this module for transport encodings and escaping, not password hashing or cryptographic operations.
Binary
Import binary for Python struct-style packing of typed values
into a byte buffer:
- binary.pack(format, ...values) returns bytes.
- binary.unpack(format, data) returns a list<any> of values.
- binary.unpackNamed(spec, data) returns a dict<string, any>; spec is a list of {"name": string, "type": string} dicts.
- binary.size(format) returns the number of bytes the format consumes, useful for buffer sizing.
The first character of the format may set endianness: > big,
< little, ! network (same as big), = host native. The default
is big-endian. Per-field codes:
| Code | Type | Bytes |
|---|---|---|
| b | int8 (signed) | 1 |
| B | uint8 | 1 |
| h | int16 | 2 |
| H | uint16 | 2 |
| i | int32 | 4 |
| I | uint32 | 4 |
| q | int64 | 8 |
| Q | uint64 | 8 |
| f | float32 | 4 |
| d | float64 | 8 |
| Ns | N-byte string | N |
| Nx | N pad bytes | N |
A leading digit before a non-s/x code repeats it (4I is
shorthand for IIII); each repeat takes its own positional
argument.
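Since the format language is modelled on Python's struct, the size, endianness, and repeat-count rules can be checked there. This sketch assumes the codes behave as in the table above. One known difference: Python defaults to native byte order with alignment when no prefix is given, whereas the default described here is big-endian, so every format below carries an explicit prefix.

```python
import struct

# Sizes follow directly from the per-field byte counts.
print(struct.calcsize(">IHB"))      # 7  (4 + 2 + 1)
print(struct.calcsize(">4I"))       # 16 (four uint32 values)
print(struct.calcsize(">2x4sH"))    # 8  (2 pad bytes + 4-byte string + 2)

# The prefix decides the byte order of multi-byte fields.
print(struct.pack(">H", 1024).hex())   # 0400
print(struct.pack("<H", 1024).hex())   # 0004

# A repeat count is shorthand; each repeat still takes its own argument.
same = struct.pack(">4I", 1, 2, 3, 4) == struct.pack(">IIII", 1, 2, 3, 4)
print(same)                         # True
```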
import binary;
import bytes;
let header = binary.pack(">IHB", 0xDEADBEEF, 1024, 7);
io.println(bytes.toHex(header)); /* deadbeef040007 */
let parts = binary.unpack(">IHB", header);
io.println(parts); /* [3735928559, 1024, 7] */
let labelled = binary.unpackNamed([
  {"name": "magic", "type": ">I"},
  {"name": "size", "type": "H"},
  {"name": "version", "type": "B"}
], header);
io.println(labelled["magic"]); /* 3735928559 */
Unsigned 64-bit values whose high bit is set are returned as a
big-int (Int) on unpack so the value round-trips losslessly;
pack accepts either int form.
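A Python sketch of the same round trip, including a named unpack and a 64-bit value with the high bit set. The unpack_named helper is a hypothetical analogue of binary.unpackNamed built on struct; it assumes each spec entry is a single-value code and is not how the binary module is implemented.

```python
import struct

def unpack_named(spec, data):
    """Hypothetical analogue of binary.unpackNamed for single-value codes."""
    fmt = "".join(field["type"] for field in spec)
    if fmt[:1] not in "<>!=":
        fmt = ">" + fmt  # the text above gives big-endian as the default
    values = struct.unpack(fmt, data)
    return {field["name"]: value for field, value in zip(spec, values)}

header = struct.pack(">IHB", 0xDEADBEEF, 1024, 7)
print(header.hex())                      # deadbeef040007

labelled = unpack_named([
    {"name": "magic", "type": ">I"},
    {"name": "size", "type": "H"},
    {"name": "version", "type": "B"},
], header)
print(labelled["magic"])                 # 3735928559

# A uint64 with the high bit set survives the round trip unchanged.
big = 2**63 + 1
print(struct.unpack(">Q", struct.pack(">Q", big))[0] == big)   # True
```

Python integers are arbitrary precision, so no separate big-int type appears here; the point is only that the unsigned value is not reinterpreted as negative.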
Compression
Import compress:
- gzip(value)
- gunzip(bytes)
import bytes;
import compress;
let packed = compress.gzip(bytes.fromString("payload"));
io.println(bytes.toString(compress.gunzip(packed)));
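The same round trip in Python's gzip module, shown as an analogy rather than as the compress module's behaviour. It also prints the two-byte magic number that identifies a gzip stream, which is a quick sanity check before decompressing untrusted input.

```python
import gzip

packed = gzip.compress(b"payload")
print(packed[:2].hex())          # 1f8b, the gzip magic number
print(gzip.decompress(packed))   # b'payload'
```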
Archives
Import archive for zip and tar archive reading and writing.
The 1.4.0 API is eager: readers materialise the full entry list
in memory; writers take a list of entry dicts and return bytes.
A streaming cursor API is queued for a follow-up.
- archive.zipRead(bytes) and archive.zipWrite(entries).
- archive.tarRead(bytes) and archive.tarWrite(entries).
- archive.tarGzRead(bytes) and archive.tarGzWrite(entries) for the common .tar.gz / .tgz shape.
Each reader returns a list<dict<string, any>> whose dicts
carry name (string), data (bytes), isDir (bool), and
size (int). Each writer accepts the same shape; the data
field may be a string or bytes. Tar writers sort entries by
name so output is deterministic.
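The eager entry-list shape and the sort-by-name rule can be sketched with Python's tarfile module. The tar_write and tar_read helpers are hypothetical stand-ins. The sketch assumes determinism comes from sorting plus fixed metadata (tarfile.TarInfo defaults the modification time to 0); how the archive module fixes its metadata is not stated above.

```python
import io
import tarfile

def tar_write(entries):
    """Write entry dicts to tar bytes, sorted by name for stable output."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for entry in sorted(entries, key=lambda e: e["name"]):
            data = entry["data"]
            if isinstance(data, str):          # data may be a string or bytes
                data = data.encode("utf-8")
            info = tarfile.TarInfo(name=entry["name"])
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def tar_read(raw):
    """Eagerly materialise every entry as a dict."""
    out = []
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        for member in tar.getmembers():
            handle = tar.extractfile(member)   # None for directories
            data = handle.read() if handle is not None else b""
            out.append({"name": member.name, "data": data,
                        "isDir": member.isdir(), "size": member.size})
    return out

entries = [{"name": "b.txt", "data": "two"}, {"name": "a.txt", "data": b"one"}]
print(tar_write(entries) == tar_write(list(reversed(entries))))   # True
print([e["name"] for e in tar_read(tar_write(entries))])
```

Because input order no longer affects the bytes produced, the output can be hashed or compared across builds.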
import archive;
import bytes;
let raw = archive.zipWrite([
  {"name": "hello.txt", "data": "hello world"},
  {"name": "nested/inside.txt", "data": "nested"}
]);
let entries = archive.zipRead(raw);
for (e in entries) {
  io.println(e["name"] as string);
  io.println(bytes.toString(e["data"] as bytes));
}
let tgz = archive.tarGzWrite([
  {"name": "config.toml", "data": "key = \"value\""}
]);
let configEntries = archive.tarGzRead(tgz);
Reader errors (corrupt or non-archive bytes) and writer errors
(missing name / data field) throw catchable runtime
exceptions, so callers can wrap untrusted input in try / catch.
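The error-handling pattern for untrusted input can be sketched in Python with zipfile. The zip_write and zip_read helpers are hypothetical, and mapping both failure modes to ValueError is a choice made for this sketch; the exception types the archive module throws are not specified above.

```python
import io
import zipfile

def zip_write(entries):
    """Write entry dicts to zip bytes; a missing field is an error."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in entries:
            if "name" not in entry or "data" not in entry:
                raise ValueError("entry needs both 'name' and 'data'")
            zf.writestr(entry["name"], entry["data"])  # str or bytes
    return buf.getvalue()

def zip_read(raw):
    """Eagerly read all entries; corrupt input surfaces as ValueError."""
    try:
        with zipfile.ZipFile(io.BytesIO(raw)) as zf:
            return [{"name": info.filename, "data": zf.read(info),
                     "isDir": info.is_dir(), "size": info.file_size}
                    for info in zf.infolist()]
    except zipfile.BadZipFile as exc:
        raise ValueError(f"not a readable zip archive: {exc}") from exc

raw = zip_write([{"name": "hello.txt", "data": "hello world"}])
print(zip_read(raw)[0]["data"])       # b'hello world'

try:
    zip_read(b"definitely not a zip")
except ValueError as err:
    print("rejected:", err)
```

Wrapping only the read call keeps the handler narrow, so unrelated failures elsewhere in the caller are not swallowed.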