2 - Structured Data
Structured Data
YAML, JSON, and Yet More Markup
Tom Blount
[email protected]
Last lecture
• Unstructured data
• CSV and TSV
• Markup Concepts
• Why do we want machine-readable data?
This lecture
• More machine-readable data!
• YAML
• JSON
• More markup!
• HTML
• ink
• And soon, XML!
YAML
• Originally “Yet Another Markup Language”…
• …but now “YAML Ain’t Markup Language”
• Widely used in config files
https://yaml.org/
YAML
• Relies on whitespace for structure, rather than tags
• Easier to read! (Yay?)
• Harder to write! (Boo!)
• Key-value pairs
YAML Example (YAxmple?)
name: Tom
YAML Example (YAxmple?)
person:
  name: Tom
  role: Teaching
YAML Example (YAxmple?)
person:
  name: Tom
  role: Teaching
  favourite numbers:
    - 42
    - 13
    - 3.141
YAML Example (YAxmple?)
people:
  - person:
      name: Tom
      role: Teaching
      favourite numbers:
        - 42
        - 13
        - 3.141
  - person:
      name: Oli
      role: Teaching
      favourite food: pizza
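For intuition, here is roughly the structure a YAML parser would hand back for the last example, written out by hand as plain Python. This is only a sketch: no YAML parser is involved, the variable name `people_doc` is made up for illustration, and it assumes `name`, `role` and so on are nested under each `person`.

```python
# Hand-written equivalent of the "people" YAML example.
# Mappings become dicts, sequences ("- item") become lists.
people_doc = {
    "people": [
        {
            "person": {
                "name": "Tom",
                "role": "Teaching",
                "favourite numbers": [42, 13, 3.141],
            }
        },
        {
            "person": {
                "name": "Oli",
                "role": "Teaching",
                "favourite food": "pizza",
            }
        },
    ]
}

# Walk the structure: each list item is a mapping with a single "person" key.
names = [entry["person"]["name"] for entry in people_doc["people"]]
print(names)
```

Note that keys such as `favourite numbers` may contain spaces, and that the two people do not need to share the same set of keys.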
YAML
• When/why would I use it? Why wouldn’t I use it?
• Config files
• Passing messages between applications
• Saving (simple) application state
• Spec is somewhat ambiguous – not all parsers will give the same result!
• Not as widely used as some other data serialisation formats
YAML Exercise
• Here’s some YAML:
---
- name: Tatooine
    rotation_period: 23
    orbital_period: 304
    diameter: 10465
    climate: "arid"
    gravity: 1 standard
    terrain: desert
    surface_water: 1
    population: 200000
- name: Alderaan
    rotation_period: 24
    orbital_period: 364
    diameter: 12500
    climate: "temperate"
    gravity: 1 standard
    terrain: grasslands, mountains
    surface_water: 40
    population: 2000000000
- name: Yavin IV
    rotation_period: 24
    orbital_period: 4818
    diameter: 10200
    climate: "temperate, tropical"
    gravity: 1 standard
    terrain: jungle, rainforests
    surface_water: 8
    population: 1000
1. Which planet has the largest population?
2. What’s the most common climate?
3. Why isn’t this valid YAML?
YAML Exercise
• Parent can’t have a value and children
• And…
• …I used tabs, not spaces!
---
- planet:
    name: Tatooine
    rotation_period: 23
    orbital_period: 304
    diameter: 10465
    climate: "arid"
    gravity: 1 standard
    terrain: desert
    surface_water: 1
    population: 200000
- planet:
    name: Alderaan
    rotation_period: 24
    orbital_period: 364
    diameter: 12500
    climate: "temperate"
    gravity: 1 standard
    terrain: grasslands, mountains
    surface_water: 40
    population: 2000000000
- planet:
    name: Yavin IV
    rotation_period: 24
    orbital_period: 4818
    diameter: 10200
    climate: "temperate, tropical"
    gravity: 1 standard
    terrain: jungle, rainforests
    surface_water: 8
    population: 1000
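Once data like this is machine-readable, the first two exercise questions can be answered by a program rather than by eye. The sketch below types three fields of each planet out by hand as Python dicts (no YAML parser is used), and it makes one assumption that the exercise does not state: a climate value such as "temperate, tropical" counts once for each climate it lists.

```python
from collections import Counter

# The three planets from the exercise, typed out by hand as Python dicts.
planets = [
    {"name": "Tatooine", "climate": "arid", "population": 200000},
    {"name": "Alderaan", "climate": "temperate", "population": 2000000000},
    {"name": "Yavin IV", "climate": "temperate, tropical", "population": 1000},
]

# 1. Planet with the largest population.
largest = max(planets, key=lambda p: p["population"])

# 2. Most common climate. Assumption: a comma-separated value counts
#    once for each climate it lists.
climate_counts = Counter(
    climate.strip()
    for p in planets
    for climate in p["climate"].split(",")
)
most_common_climate, count = climate_counts.most_common(1)[0]

print(largest["name"])
print(most_common_climate, count)
```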
JSON
• JavaScript Object Notation
• 3 important concepts
• Objects
• Lists
• Values
• Objects, but no classes!
https://www.json.org/
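A quick way to see the three concepts is to parse a small piece of JSON and look at what comes back. The sketch below uses Python's standard `json` module; the JSON text and its key names are made up for illustration.

```python
import json

text = '{"object": {"key": "value"}, "list": [1, 2.5, "three"], "flag": true, "nothing": null}'
data = json.loads(text)

# Objects become dicts, lists become lists, and values become
# str, int, float, bool or None.
for key, value in data.items():
    print(key, type(value).__name__)
```

The parsed object is an ordinary dict: it carries data but no methods, which is what "objects, but no classes" means in practice.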
JSON Example
{
  "name": "Tom"
}
JSON Example
{
  "person": {
    "name": "Tom",
    "role": "Teaching",
    "lucky": [42, 13, 3.141]
  }
}
JSON Example
{
  "people": [
    {
      "name": "Tom",
      "role": "Teaching",
      "lucky": [42, 13, 3.141]
    },
    {
      "name": "Oli",
      "role": "Teaching",
      "favourite food": "Pizza"
    }
  ]
}
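Two practical points about a document like this can be shown in a few lines of Python: not every object in a list has the same keys, and JSON does not allow a comma after the last item. This is a minimal sketch using the standard `json` module; the variable names are illustrative only.

```python
import json

text = """
{
  "people": [
    {"name": "Tom", "role": "Teaching", "lucky": [42, 13, 3.141]},
    {"name": "Oli", "role": "Teaching", "favourite food": "Pizza"}
  ]
}
"""
doc = json.loads(text)

# Not every person has every key, so use .get() with a default
# instead of indexing directly.
for person in doc["people"]:
    lucky = person.get("lucky", [])
    print(person["name"], len(lucky))

# A strict parser rejects a trailing comma after the last list item.
try:
    json.loads('{"people": [{"name": "Oli"},]}')
    trailing_comma_ok = True
except json.JSONDecodeError:
    trailing_comma_ok = False
print(trailing_comma_ok)
```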
JSON
• How to use it?
• Java
• Python
• JavaScript
• Plenty of languages!
JSON - Python
import json
myJSON = json.loads('{"person": {"name": "Tom"}}')  # parse JSON text into a dict
name = myJSON["person"]["name"]  # "Tom"
JSON - JS
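const myJSON = JSON.parse('{"person": {"name": "Tom"}}'); // parse JSON text into an object
const name = myJSON["person"]["name"]; // "Tom"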
JSON
• When/why would I use it?
• Pretty much whenever you’re passing data across the web
• (e.g. API calls)
• Passing data between programs/languages
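Passing data between programs comes down to a round trip: one side serialises its data to JSON text, the other side parses that text back into data. The sketch below does both halves in Python with the standard `json` module; the `message` contents are made up, and in practice the receiver could be written in any language with a JSON parser.

```python
import json

# A Python-side value we want to hand to another program.
message = {"person": {"name": "Tom", "lucky": [42, 13, 3.141]}, "active": True}

# Serialise to JSON text: this string is what travels over the wire
# or gets written to a file.
payload = json.dumps(message)

# The receiver turns the text back into data.
received = json.loads(payload)

print(payload)
print(received == message)
```

Python's `True` is written as `true` in the JSON text, which is the kind of translation between language-specific and language-neutral forms that a serialisation format takes care of.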
Death Star
JSON Exercise
• https://swapi.dev/api/starships/?format=json
1800000
JSON Exercise
• https://swapi.dev/api/starships/?format=json
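One way to approach an exercise like this in code is to download the response, parse it as JSON, and then pick out the fields of interest. The sketch below uses only the Python standard library. The helper names `fetch_json` and `names_of` are made up, and the response shape (a top-level `results` list whose entries have a `name`) is an assumption to check against what the URL actually returns; `sample_page` is invented stand-in data, not real API output.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def names_of(page):
    """Return the "name" of each entry in a page's "results" list."""
    return [item["name"] for item in page.get("results", [])]


# Made-up stand-in for one page of an API response; the real field
# names and values may differ.
sample_page = {
    "count": 2,
    "results": [
        {"name": "Example Ship A", "crew": "5"},
        {"name": "Example Ship B", "crew": "120"},
    ],
}

print(names_of(sample_page))
# With network access: names_of(fetch_json(url)) for the exercise URL.
```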
Title Quote
Story Paragraph
\documentclass{article}
\usepackage{graphicx} % needed for \includegraphics
\begin{document}
\section{My Document}
Welcome to my document!
\begin{figure}
  \includegraphics{frog.jpg}
  \caption{Ta-dah! A picture of a frog}
\end{figure}
\end{document}
HTML Exercise
• What’s changed?
<!DOCTYPE html>
<html>
  <body>
    <section>
      <h1>My Document</h1>
      <p>Welcome to my document!</p>
      <p>In this document is a picture of a <strong>frog</strong>...</p>
      <section>
        <h2>And now: a Figure</h2>
        <figure>
          <img src="frog.jpg">
          <figcaption>Ta-dah! A picture of a frog</figcaption>
        </figure>
      </section>
    </section>
  </body>
</html>
HTML & YAML?
• Jekyll – static site generator
• Liquid – templating system
---
food: pizza
salutation: Hi
---
<!DOCTYPE html>
<html>
<body>
  <h1>{{ page.food }}</h1>
  <p>{{ page.salutation }}! This page is all about {{ page.food }} and...</p>
  ...

---
food: bread
salutation: Hello!
---
<!DOCTYPE html>
<html>
<body>
  <h1>{{ page.food }}</h1>
  <p>{{ page.salutation }}! This page is all about {{ page.food }} and...</p>
  ...

---
food: fish & chips
salutation: Howdy
---
<!DOCTYPE html>
<html>
<body>
  <h1>{{ page.food }}</h1>
  <p>{{ page.salutation }}! This page is all about {{ page.food }} and...</p>
  ...
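The idea of combining YAML front matter with an HTML template can be imitated in a few lines of Python. This is a toy sketch of the concept, not how Jekyll or Liquid are implemented: it only understands flat `key: value` front matter and `{{ page.key }}` placeholders, and the names `split_front_matter` and `render` are made up for illustration.

```python
import re

PAGE = """---
food: pizza
salutation: Hi
---
<h1>{{ page.food }}</h1>
<p>{{ page.salutation }}! This page is all about {{ page.food }} and...</p>
"""


def split_front_matter(text):
    """Split a page into (front matter dict, body).

    Only handles flat "key: value" lines, not full YAML.
    """
    _, header, body = text.split("---\n", 2)
    data = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        data[key.strip()] = value.strip()
    return data, body


def render(text):
    """Replace each {{ page.key }} in the body with its front matter value."""
    data, body = split_front_matter(text)
    return re.sub(
        r"\{\{\s*page\.(\w+)\s*\}\}",
        lambda match: data[match.group(1)],
        body,
    )


print(render(PAGE))
```

Swapping the two front matter lines for a different `food` and `salutation` produces a different page from the same template, which is the point of a static site generator.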
ink
• Markup for web-based interactive fiction
• Effectively, a different markup language for hypertext
• Integrates with different game engines (including Unity)
• “Knots”
• “Diverts”
• “Choices”
Ink – “Knots”
• Labels/sections
• All content beneath the knot belongs to that knot
• (Until the next knot)
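The rule that content belongs to the most recent knot can be sketched as a small Python function. The ink snippet in `STORY` is a short sample written for this illustration, and the parser only recognises knot headers (a line starting with two or more `=` signs followed by a name); it ignores the rest of ink's syntax and is not how the real ink compiler works.

```python
import re

STORY = """\
=== start ===
Once upon a time...
-> forest

=== forest ===
You are in a forest.
* [Go back] -> start
"""

# A knot header: two or more "=", a name, and optional closing "=" signs.
KNOT_HEADER = re.compile(r"^==+\s*(\w+)\s*=*\s*$")


def split_into_knots(source):
    """Group non-blank lines under the most recent knot header."""
    knots = {}
    current = None
    for line in source.splitlines():
        header = KNOT_HEADER.match(line)
        if header:
            current = header.group(1)
            knots[current] = []
        elif current is not None and line.strip():
            knots[current].append(line)
    return knots


knots = split_into_knots(STORY)
print(list(knots))
print(knots["forest"])
```

In the sample, `-> forest` is a divert to another knot and the line starting with `*` is a choice, so the three concepts listed earlier all appear in a few lines.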
{
  "title": "My Story",
  "data": {
    "stitches": {
      "onceUponATime": {
        "content": [
          "Once upon a time...",
          {"option": "", "linkPath": null, "ifConditions": null, "notIfConditions": null},
          {"pageNum": 1}
        ]
      }
    },
    "initial": "onceUponATime",
    "optionMirroring": true,
    "allowCheckpoints": false,
    "editorData": {
      "playPoint": "onceUponATime",
      "libraryVisible": false,
      "authorName": "Tom",
      "textSize": 0
    }
  },
  "url_key": 150784
}
Summary
• Some more types of structured data, that add a little more structure
• Some more types of markup too!
Next Time
• XML!
• Tuesday Lab session
• No lecture on Friday or Monday!