Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
PyYAML (YAML 1.1 currently) and ruamel.yaml (YAML 1.2) are 2 Python libraries for parsing YAML. PyYAML is more widely used.
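One visible consequence of PyYAML following YAML 1.1 is that unquoted scalars such as yes, no, on and off resolve to booleans, whereas the YAML 1.2 core schema only recognizes true and false. A minimal sketch of this behavior (the keys are made up for illustration, and yaml.safe_load is used for brevity even though these notes otherwise use yaml.load):

```python
import yaml

# Under the YAML 1.1 rules that PyYAML implements,
# unquoted yes/no/on/off are parsed as booleans.
doc = """
enabled: yes
debug: off
country: "no"
"""
parsed = yaml.safe_load(doc)
print(parsed)
```

Quoting the value, as done for country, keeps it a string.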
YAML (via PyYAML) is preferred over json for serialization and deserialization for multiple reasons.
- YAML is essentially a superset of JSON, so PyYAML can parse typical JSON documents.
- PyYAML supports serializing and deserializing set while json does not.
- YAML is more readable than JSON.
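The first two reasons above can be checked directly. The following is a small sketch, not a benchmark: the sample values are made up, and safe_dump/safe_load are used in place of the dump/load calls that appear later in these notes.

```python
import json
import yaml

tags = {"a", "b", "c"}

# json has no set type: dumping a set raises TypeError.
try:
    json.dumps(tags)
    json_ok = True
except TypeError:
    json_ok = False

# PyYAML represents a set with the !!set tag and restores it on load.
text = yaml.safe_dump(tags)
restored = yaml.safe_load(text)

# JSON text is, for typical documents, also valid YAML.
record = {"name": "John Doe", "scores": [1, 2.5, None], "active": True}
from_json = yaml.safe_load(json.dumps(record))

print(json_ok, restored, from_json)
```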
!pip3 install pyyaml
import yaml
doc = """
a: 1
b:
  c: 3
  d: 4
"""
dic = yaml.load(doc, Loader=yaml.FullLoader)
dic
with open("test.yml", "w") as fout:
    yaml.dump(dic, fout)
with open("test.yml") as fin:
    print(yaml.load(fin, Loader=yaml.FullLoader))
Read YAML from a String
doc = """
-
  cal_dt: 2019-01-01
-
  cal_dt: 2019-01-02
"""
yaml.load(doc, Loader=yaml.FullLoader)
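Note that PyYAML converts unquoted ISO dates such as the cal_dt values above into datetime.date objects rather than strings. A short sketch, using safe_load and a quoted second value added here purely for contrast:

```python
import datetime
import yaml

doc = """
- cal_dt: 2019-01-01
- cal_dt: "2019-01-02"
"""
rows = yaml.safe_load(doc)

# The unquoted value becomes a date; the quoted one stays a string.
print(type(rows[0]["cal_dt"]), type(rows[1]["cal_dt"]))
```

Quote the value in the YAML source if a plain string is what you want.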
Read YAML From File (Single Doc)
with open("items.yaml") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
print(data)
!cat set.yaml
with open("set.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
data
type(data)
Read YAML (Multiple Docs)
Notice that the function yaml.load_all returns a generator, which yields one document at a time!
with open("data.yaml") as f:
    docs = yaml.load_all(f, Loader=yaml.FullLoader)
    for doc in docs:
        for k, v in doc.items():
            print(k, "->", v)
Convert the generator to a list if you need to use the documents outside the with block.
with open("data.yaml") as f:
    docs = list(yaml.load_all(f, Loader=yaml.FullLoader))
docs
for doc in docs:
    for k, v in doc.items():
        print(k, "->", v)
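The generator behavior can be seen without any file. The sketch below parses a made-up two-document string with safe_load_all and shows that the result can be consumed only once, which is why materializing it with list is useful. When reading from a file, the generator also has to be consumed while the file is still open.

```python
import inspect
import yaml

text = """
name: first
---
name: second
"""
docs = yaml.safe_load_all(text)
is_gen = inspect.isgenerator(docs)

# The first pass consumes the generator; a second pass finds nothing left.
first_pass = list(docs)
second_pass = list(docs)
print(is_gen, first_pass, second_pass)
```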
YAML Dump to String
users = [
    {"name": "John Doe", "occupation": "gardener"},
    {"name": "Lucy Black", "occupation": "teacher"},
]
print(yaml.dump(users))
print(yaml.dump(set([1, 2, 3])))
YAML Dump to File
with open("users.yaml", "w") as fout:
    yaml.dump(users, fout)
with open("set.yaml", "w") as fout:
    yaml.dump(set([1, 2, 3]), fout)
!cat set.yaml
Tokens
PyYAML also exposes a lower-level API for parsing YAML. The function yaml.scan scans a YAML stream and produces scanning tokens.
The following example scans and prints tokens.
with open("items.yaml") as f:
    data = yaml.scan(f, Loader=yaml.FullLoader)
    for token in data:
        print(token)
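To get a feel for what the scanner emits without needing items.yaml, here is a small sketch that scans a one-line mapping from a string and collects the token class names. The input is made up, and the default loader of yaml.scan is used.

```python
import yaml

# Scan a tiny document and record the class name of each token.
tokens = list(yaml.scan("a: 1"))
names = [type(t).__name__ for t in tokens]
print(names)
```

The stream is always bracketed by a StreamStartToken and a StreamEndToken, with key, value and scalar tokens in between.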
Fix Indentation Issue
PyYAML currently does not indent list items nested under a mapping key. For details, please refer to Incorrect indentation with lists #234 .
class Dumper(yaml.Dumper):
    def increase_indent(self, flow=False, *args, **kwargs):
        return super().increase_indent(flow=flow, indentless=False)
yaml.dump(data, Dumper=Dumper)
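The effect of the workaround is easiest to see side by side. The sketch below dumps the same made-up dictionary with the default dumper and with a subclass like the one above. It is named IndentedDumper here only to keep the example self-contained.

```python
import yaml

class IndentedDumper(yaml.Dumper):
    # Force block sequences to be indented under their parent key.
    def increase_indent(self, flow=False, *args, **kwargs):
        return super().increase_indent(flow=flow, indentless=False)

data = {"items": ["apple", "banana"]}
default_out = yaml.dump(data, default_flow_style=False)
indented_out = yaml.dump(data, Dumper=IndentedDumper, default_flow_style=False)

print(default_out)
print(indented_out)
```

Both outputs are valid YAML and load to the same data; the difference is purely cosmetic.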
Examples
with open("ex1.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
print(data)
with open("ex2.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
print(data)
type(data[0]["cal_dt"])
with open("ex3.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
print(data)
with open("ex4.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
print(data)
with open("ex5.yaml", "r") as fin:
    data = yaml.load(fin, Loader=yaml.FullLoader)
data
data["y"]
eval(compile(data["y"], "some_file", "exec"))
x = eval("range(10)")
x
import json
json.dumps(list(x))
# Note: exec always returns None, so this call raises a TypeError.
list(exec(data["y"]))
eval, exec, single, compile
simple 1 line python code which requires you to have every library ready ...
multiple: need a way to reliably run the code and return the result ...
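One possible way to handle both cases in the notes above is sketched here. A single expression can go through eval, which returns its value. Multi-statement code can go through exec with an explicit namespace dict, from which a result is read back afterwards. The variable name result and the file label "<yaml-snippet>" are conventions assumed for this sketch, not anything PyYAML defines.

```python
# Single expression: eval returns its value directly.
squares = eval("[i * i for i in range(5)]")

# Multiple statements: exec returns None, so the result is read back
# from the namespace dict the code ran in.
source = """
import math
radius = 2
result = round(math.pi * radius ** 2, 2)
"""
namespace = {}
exec(compile(source, "<yaml-snippet>", "exec"), namespace)
area = namespace["result"]

print(squares, area)
```

Either way, code stored in a YAML file runs with full privileges, so this only makes sense for files you trust.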
yaml.load("""!!python/list(range(10))""", Loader=yaml.FullLoader)