Next steps
I've released, or updated, a series of tools for working with, and converting between, Eatdrinkfeelgood 1.1 and (still experimental) 2.0 documents.
-
erdfg.py
Python classes for parsing and rendering Eatdrinkfeelgood 2.0 (e(r)dfg) documents
-
eatdrinkfeelgood-1.1-to-2.0.xsl
XSLT stylesheet to transform an Eatdrinkfeelgood 1.1 XML document into an Eatdrinkfeelgood 2.0 text/N3 document.
-
eatdrinkfeelgood-1.1-to-xhtml.xsl
XSLT stylesheet to transform an Eatdrinkfeelgood 1.1 XML document into an XHTML document. Updated some of the logic to handle namespace issues properly so that it will work with (version 1.1) documents generated by the e(r)dfg Python tools.
At some point there will probably be an XML serialization for e(r)dfg documents (assuming they aren't already natively XML) and all of the 1.1-to-whatever stylesheets will be updated but for the time being this is a workable alternative.
-
eatdrinkfeelgood-1.1-to-indexcard-fo.xsl
XSLT stylesheet to transform an Eatdrinkfeelgood 1.1 XML document into an XSL-FO document, which can then be used to generate print-y (read : PDF) versions. There is a known issue in this stylesheet where not all the external references are included at the end of an index card. I'm not sure, but I think this bug has always been present, and I will make good use of my daily three-hour commute to fix it.
So...
Converting from 1.1 to 2.0
#!/bin/sh
# usage : <script> input-1.1.xml output-2.0.txt
EXEC_XSLTPROC=/usr/bin/xsltproc
XSL_ERDFG="/home/asc/lib/xsl/eatdrinkfeelgood/eatdrinkfeelgood-1.1-to-2.0a.xsl"
EDFG=$1
ERDFG=$2
${EXEC_XSLTPROC} -o ${ERDFG} ${XSL_ERDFG} ${EDFG}
Converting from 2.0 to text
#!/bin/sh
EXEC_PYTHON=/usr/bin/python
EXEC_TOTEXT="${EXEC_PYTHON} /home/asc/lib/python/erdfg/bin/text.py"
ERDFG=$1
TXT=$2
${EXEC_TOTEXT} ${ERDFG} > ${TXT}
Converting from 2.0 to XHTML, by way of 1.1
#!/bin/sh
EXEC_PYTHON=/usr/bin/python
EXEC_TOXML="${EXEC_PYTHON} /home/asc/lib/python/erdfg/bin/as_xml.py"
EXEC_XSLTPROC=/usr/bin/xsltproc
XSL_XHTML="/home/asc/lib/xsl/eatdrinkfeelgood/eatdrinkfeelgood-1.1-to-xhtml.xsl"
ERDFG=$1
XHTML=$2
TMP_XML="${ERDFG}.xml"
# 2.0 -> 1.1 XML, then 1.1 XML -> XHTML
${EXEC_TOXML} ${ERDFG} > ${TMP_XML}
${EXEC_XSLTPROC} -o ${XHTML} ${XSL_XHTML} ${TMP_XML}
rm ${TMP_XML}
- Galatoire's Sweet Potato Cheesecake (2.0)
- Galatoire's Sweet Potato Cheesecake (XHTML)
- Galatoire's Sweet Potato Cheesecake (XHTML+CSS)
Converting from 2.0 to PDF, by way of 1.1
#!/bin/sh
EXEC_PYTHON=/usr/bin/python
EXEC_TOXML="${EXEC_PYTHON} /home/asc/lib/python/erdfg/bin/as_xml.py"
EXEC_XSLTPROC=/usr/bin/xsltproc
EXEC_FOP=/Installers/fop-0.20.5/fop.sh
XSL_XSLFO="/home/asc/lib/xsl/eatdrinkfeelgood/eatdrinkfeelgood-1.1-to-indexcard-fo.xsl"
ERDFG=$1
PDF=$2
TMP_XML="${ERDFG}.xml"
TMP_FO="${ERDFG}.fo"
# 2.0 -> 1.1 XML, then 1.1 XML -> XSL-FO, then XSL-FO -> PDF
${EXEC_TOXML} ${ERDFG} > ${TMP_XML}
${EXEC_XSLTPROC} -o ${TMP_FO} ${XSL_XSLFO} ${TMP_XML}
${EXEC_FOP} -fo ${TMP_FO} -pdf ${PDF}
rm ${TMP_XML}
rm ${TMP_FO}
First impressions
The good and the bad are mostly where I expected them to be found.
I really, really, miss being able to use XPath and XInclude. They are like magic and make combining and transforming recipes fantastically easy. Doing the same iterating over graphs, particularly when they are sometimes ambiguous by design, is not so much fun.
However, writing XML (or, by extension, writing and maintaining tools to make writing XML easy) is a pain in the ass. Writing e(r)dfg using this funny hybrid of plain text and N3 is not.
Next steps
In no particular order :
-
Spend some time using e(r)dfg in its current state.
-
Finish testing the ability to reference external e(r)dfg documents using an XInclude-like syntax.
-
Create a binary drag and drop application to convert e(r)dfg documents into, at least, plain text. Possibly create a simple viewer application.
-
Hold my nose and write an e(r)dfg -> JSON generator for the Python classes.
-
Write an e(r)dfg -> e(r)dfg strict generator for the Python classes and set up a SPARQL server with a simplified REST interface for adding and querying recipes.
-
Figure out how to describe tags in all of this. I expect that, like ingredients, there will be both a default shorthand as well as a formal (strict) version. Something like dc:subject "recipe dessert cake"; and dc:subject (del:recipe del:dessert del:cake); respectively. People are free to write the latter, if that's their thing, but there should also be tools to do it for them.
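A tool to "do it for them" could be very small. The following is only a sketch under assumptions: the function names are made up for illustration, the del: prefix is taken from the example above, and tags are assumed to be separated by whitespace.

```python
def tags_to_strict(shorthand, prefix="del"):
    """Turn a whitespace-separated tag string into a strict N3 list.

    Both the function name and the default prefix are illustrative
    assumptions; they are not part of erdfg.py.
    """
    terms = shorthand.split()
    return "(" + " ".join(f"{prefix}:{term}" for term in terms) + ")"


def subject_statement(shorthand, prefix="del"):
    """Build the strict dc:subject statement for a shorthand tag string."""
    return f"dc:subject {tags_to_strict(shorthand, prefix)};"


if __name__ == "__main__":
    print(subject_statement("recipe dessert cake"))
```

Tags containing spaces or punctuation would need an escaping rule that this sketch does not attempt.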
That is all.
This blog post is full of links.
<you> :a "what you eat" .
Independent of the actual markup format, I've been trying to work out the relationship between the various elements in an Eatdrinkfeelgood recipe. Here's the current working model of the Eatdrinkfeelgood Markup Language, affectionately referred to as e(r)dfg for short.
I've given up trying to find proper names for things right now. For the purpose of this discussion, here are four names that, really, all mean the same thing. But remember : They don't, okay?
Classes
Classes are the highest level elements of an e(r)dfg document and are a conversational shortcut, never explicitly named, that denotes how often an item is likely to appear in a document.
-
Singletons
Contain elements that only occur once in a document. For example, a yield or a history.
-
Repeatables
Contain elements that may occur multiple times. These may include ingredients, directions or notes. Repeatables are represented as sets.
Classes typically contain all the properties contained in a facet but this is not always the case so it is better to think of a property, rather than a facet, as being of a certain type of class.
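To make the distinction concrete, here is a rough sketch of how a processor might use the two classes. The membership of each set is taken only from the examples named above and is not a complete list; the function names are hypothetical.

```python
from collections import Counter

# Membership is limited to the examples given in the text; a real
# processor would need the full list from the spec.
SINGLETONS = {"yield", "history"}
REPEATABLES = {"ingredients", "directions", "notes"}


def class_of(name):
    """Return the (unnamed) class a property belongs to, or None."""
    if name in SINGLETONS:
        return "singleton"
    if name in REPEATABLES:
        return "repeatable"
    return None


def repeated_singletons(names):
    """List singleton properties that occur more than once in a document."""
    counts = Counter(names)
    return sorted(name for name in SINGLETONS if counts[name] > 1)
```

A document that lists yield twice would come back as ["yield"], while any number of notes passes without complaint.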
Facets
Like classes, facets are just a conceptual device to group parts of a document but are not named as such. (At least not explicitly. The RDF peanuts in the gallery would be correct in assuming that classes are treated like (cue drumroll) RDF classes.)
-
Recipes
Contain the following properties : directions; ingredients. At the end of the day, this is really all you'd need to get by. There is a school that says all you need are directions but if the goal is to find some reasonable middle-ground between humans and machines then you need both.
-
Descriptions
Contain the following properties : title; description; creator; source; yield.
-
Requirements
Contain the following properties : ingredients; equipment; time.
-
Guidelines
Contain the following properties : directions; notes.
-
Annotations
Contain the following properties : description; notes; history.
-
Products
Contain the following properties : yield.
Facets may be singletons or repeatables and, in the case of annotations, both. It is probably better to think of a property, rather than a facet, as being a member of a particular class.
Properties
Properties are the actual named parts in a document. They are like keys in a dictionary and may point to simple string values or child data-structures referred to, here, as attributes.
-
title
You'd think this kind of thing would be simple, by nature, but you'd be wrong :
-
description
Pedants will point out that it's all a description and they'd be right in that way no one cares about. Description, abstract, shorter than a history; less specific than a note.
-
creator
I'm not sure this one will make the cut. I know I had a reason for this but I think I was in a meeting when I added it so who knows what I was thinking.
-
source
Where a recipe came from, who wrote it, permalinks, ISBN numbers and all manner of identifiers.
-
ingredients
A list of ingredient dictionaries, or anonymous nodes in RDF-speak. Individual ingredient descriptions are unnamed because it's more of a pain in the ass to spell it out than it is to play a little fast and loose with first principles; this tension is discussed further below.
-
directions
A list of unnamed content dictionaries. By content I am talking about a slightly bastardized version of an Atom entry. Since directions are essentially entry-like and, otherwise, difficult to pin down this is a nice in-between solution that lets users enter whatever they want, relies on Atom's type-ing hints to tell machines what to do and makes it easy to use all the XML goodness that comes with the spec.
-
notes
Notes are like directions in that they contain entries and, conceptually, are guidelines. Except that a note is a commentary and a direction is an instruction.
-
equipment
This is a sop to the machines and the machine-heads, no pun intended. At the moment it contains utensils and devices attributes which are just lists of some unknown quality.
-
time
The time required to prepare a recipe, the time required to cook a recipe. The amount of time, from start to finish, that it will take to get a dish to the table? This one still needs some thought but my current thinking is that there will only be a couple of default types but that others may be added, with the only requirement being that values be formatted as XML Schema Duration strings, e.g. PT2H15M.
-
history
A history is like a note in that both tell you something considered relevant but not necessarily integral to a recipe. Except that notes are an aside and a history is a full-on story.
-
yield
Like the time property, the core spec will probably only define a couple of attributes for adding plain text values. If people want to define more complex values, they can. I've sometimes thought of doing so but never have and I'd be hard-pressed to think of anyone who actually would either.
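For the time property, a processor has to turn duration strings into something it can do arithmetic on. XML Schema durations carry a leading P, with a T before the time part, so two hours and fifteen minutes is written PT2H15M. The sketch below is deliberately narrow and is an assumption about what a recipe tool would need: it handles hours and minutes only, and the function name is made up.

```python
import re

# Hours-and-minutes subset of the XML Schema duration lexical form.
# Days, seconds, fractions and negative durations are left out on purpose.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$")


def duration_to_minutes(value):
    """Convert a duration such as 'PT2H15M' into a number of minutes."""
    match = _DURATION.match(value)
    if match is None or value == "PT":
        raise ValueError(f"not an hours/minutes duration: {value!r}")
    hours, minutes = (int(part) if part else 0 for part in match.groups())
    return hours * 60 + minutes
```

Anything outside that subset raises a ValueError rather than being guessed at.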
There are also sets which aren't listed because it's another dumb name and I'm not sure where they fall in the hierarchy of things. They are singletons that contain repeatables. A document may only contain one collection of sets, though it may have many parts. For example, a recipe for pie may be divided into two parts : the crust and the filling. Each part may contain more than the minimum set of repeatables : ingredients and directions followed by notes, equipment and other specifics.
Similarly, there is a framework for using an XInclude-like syntax to pull in pieces of another e(r)dfg document using SPARQL.
These last two are by no means required and will probably only ever enjoy limited use. But in the spirit of making easy things easy and hard things possible they still seem worth the effort.
Attributes
These are the guts of a given document (amount=5, measure=cup, etc.) and I'm not going to get into them here. Part of the reason for this whole exercise has been to try and articulate a context in which attributes live to better identify what is and isn't necessary.
But how does it taste?
The reasons for spelling it all out like this are two-fold (if you exclude the obvious part where you want to figure out what the core elements of a recipe document are) :
- How to go about recording it as a written document and how much trouble it will be for a human being to do so.
- How much trouble it will be for a machine to parse that document into something useful and how much pain it will bring to a human who has to teach the computer what to do.
Did I mention how much I hate using non-XML formats for data exchange? I don't really have a pithy comment for that except to say : Yes. On measure, native XML just doesn't make as much sense as it once did. It is hard to read by eye, hard to write by hand and involves a sufficiently complicated setup to do anything otherwise that it's not really attractive to non-dorks and lazy dorks alike.
In the same vein, I prefer to not think of e(r)dfg, despite its clever name, as RDF either. More likely it will be presented as a plain-text format, with a formal set of markup rules and relationships, that conveniently happens to be RDF. In that regard, it will be more XML-ish in nature, meaning that, from a processor's point of view, the core set of elements will have a fixed set of combinations. (And if you really need more variations on a document title than those described above maybe you need to be asking some larger questions about your life.)
I think I am comfortable with this. Or, at least, I know that the sacrifices demanded by any of the available extremes are not.
Waiter, there's RDF in my soup!
The other day, Ed posted a recipe for chicken soup and I decided to use it as a test case for everything I've been working on. Would it be easy enough to enter by hand? How much of Ed's original recipe could I simply copy and paste? Could I read the formatted version easily afterwards? Could I write a quick and dirty computer program to dump a version of the recipe that read like the recipes we scribble on index cards?
Here's what I had to work with :
For the most part, it went well. The ingredients list, while still a bit of a nuisance, is actually an improvement. It is easier to read than the XML-ified Eatdrinkfeelgood 1.x documents. It is easier to write, and to remember what to write, than 1.x documents. It both avoids the natural language processing rabbit-hole and provides richer semantics than earlier versions of the spec, allowing users to define the values for measures and dstuffs as resources themselves.
The directions and notes were copied and pasted and while, in my version, I've lost the explicit semantics of the paragraphs I could have also just copied the raw HTML had I been so inclined. The Atom content model is handy that way.
The biggest problem, so far, is in processing the recipe as a traditional RDF document. Specifically, there's nothing that says an ingredients list can't have a list of anonymous nodes (the individual ingredient descriptions) but it does make them hard(er) to find. The first problem is the way that lists, in RDF, are interpreted as being a series of micro-lists consisting of two elements : The first item in the rest of the list, followed by a pointer to the rest of the list itself. The second problem is that the un-name-edness of individual ingredients means you have to test each element in the ingredients list first to see whether it's a node and then whether it contains something ingredient-ish like a dstuff.
Really, not so hot.
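The two problems described above can be sketched in plain Python, without an RDF library. Everything here is an illustration under assumptions: triples are modeled as (subject, predicate, object) tuples, the prefixed names are written as plain strings, and the function names are hypothetical rather than anything in erdfg.py.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def list_items(triples, head):
    """Walk the chain of micro-lists, yielding each rdf:first value.

    Assumes a well-formed list; a broken or cyclic one is not handled.
    """
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    node = head
    while node != RDF_NIL:
        yield first[node]
        node = rest[node]


def ingredient_nodes(triples, head):
    """Keep only the list items that are nodes carrying an e:dstuff."""
    has_dstuff = {s for s, p, o in triples if p == "e:dstuff"}
    return [item for item in list_items(triples, head) if item in has_dstuff]


# A three-item list : two anonymous ingredient nodes and one stray string.
TRIPLES = [
    ("_:l1", RDF_FIRST, "_:i1"), ("_:l1", RDF_REST, "_:l2"),
    ("_:l2", RDF_FIRST, "just a string"), ("_:l2", RDF_REST, "_:l3"),
    ("_:l3", RDF_FIRST, "_:i2"), ("_:l3", RDF_REST, RDF_NIL),
    ("_:i1", "e:dstuff", "butter"),
    ("_:i2", "e:dstuff", "flour"),
]
```

Calling ingredient_nodes(TRIPLES, "_:l1") walks all three cells and keeps only the two nodes that have a dstuff, which is exactly the test-every-element chore described above.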
The obvious solution to this would be to assign a type to each ingredient description. Instead of : [e:dstuff "butter"] you'd write [e:dstuff "butter"; :a e:ingredient]. Or e:ingredients [ e:ingredient ( [...] [...] ) ]. This makes the computer happy but also makes the baby Jesus cry. Seriously, if I'd wanted to get into that kind of markup soup I would have just recast the whole language as a microformat.
Writing a document should be possible with nothing more than a text editor and a little bit of patience, a straightforward, albeit boring affair. Hiding the details of the markup behind a user-friendly graphical interface would be a welcome improvement for many but it should not be a requirement.
I wrote that and I still stand by it. It is the primary motivation behind the current iteration of the format and the pull between making it easy to write one-liners versus being able, or willing, to write the recipes at all is what I'm still trying to sort out.
I love to spend time thinking about fancy GUI apps, using sexy tricks like XForms and auto-completion-y scrumjax, to read and write recipes but I honestly don't think that's how it's going to happen most of the time. More often than not I will be reading recipes from something like the 770 which won't be connected to the Internet or writing them in a copy of Notepad and emailing the finished version to myself from someone else's computer.
If I can use the same short-hand (format) for both the reading and the writing then maybe the extra time, and hassle, it will take to teach a computer program to DWIM is worth it.
Another option is to define strict and casual modes for a document where the former would mandate the use of [:a e:thingy] attributes (careful readers will note that the same issues surrounding ingredients apply to both notes and directions) and the latter would not. Glue-code to map the latter to the former would be easy enough to write and maintain. To that end, you'd have a pipeline that looked something like this :
casual -> strict -> XML -> HTML, XSL-FO (PDF), etc.
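The casual-to-strict step of that pipeline might look something like the sketch below. It assumes each anonymous node has already been parsed into a Python dictionary keyed by property name, and that the type is stored under a ":a" key to mirror the notation used here; the function name and the default type are placeholders.

```python
def to_strict(nodes, type_name="e:ingredient"):
    """Return copies of casual nodes with an explicit type added.

    Nodes that already declare a type are left as they are, so running
    the function on a strict document changes nothing.
    """
    strict = []
    for node in nodes:
        typed = dict(node)            # copy, so the casual input is untouched
        typed.setdefault(":a", type_name)
        strict.append(typed)
    return strict


if __name__ == "__main__":
    casual = [{"e:dstuff": "butter"}, {"e:dstuff": "flour", ":a": "e:ingredient"}]
    print(to_strict(casual))
```

The same function would serve for notes and directions by passing a different type_name, since they share the same problem.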
Next steps
Or something like that.