[New-ITS] Node counts
Robert Worden
Robert.Worden at Charteris.com
Thu Aug 31 17:50:05 BST 2006
David -
Just to clarify what I have counted, because we seem to have got wires
crossed on factors of 10.
The things I am counting are distinct nodes in a message definition, not
allowing repeats of nodes or self-nesting. Put another way, I am
counting distinct XPaths in the message definition, short of
self-nesting. Inasmuch as each XPath is a name for a set of nodes, that
is sort-of counting the names of nodes in the message.
I made these counts in two ways:
1. going right down to leaf nodes of the XML
2. stopping at V3 data type nodes, counting each V3 data type top
node as 1.
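
The two counting methods above can be sketched roughly as follows. This is only an illustration on a made-up message definition: the dictionary layout, the element names and the function name are assumptions, not taken from any real V3 schema or from the tool that produced the counts in this post. Self-nesting is handled here by never expanding a type that is already on the current path, which is one possible reading of "short of self-nesting".

```python
# Hypothetical toy message definition: each type maps to its child elements
# as (element_name, child_type) pairs. A child_type of None marks an XML leaf.
# None of these element names come from a real V3 schema.
DEFINITION = {
    "Message": [("id", "II"), ("patient", "Person")],
    "Person": [("name", "PN"), ("guardian", "Person")],  # self-nesting
    # Simplified, made-up internals for two data types.
    "II": [("root", None), ("extension", None)],
    "PN": [("given", None), ("family", None), ("prefix", None)],
}
DATA_TYPES = {"II", "PN"}


def count_xpaths(type_name, stop_at_data_types, on_path=frozenset()):
    """Count distinct paths below a node of the given type.

    Children whose type already occurs on the current path are skipped,
    so self-nested structures are counted only once.
    """
    on_path = on_path | {type_name}
    total = 0
    for _element, child_type in DEFINITION[type_name]:
        if child_type in on_path:
            continue  # self-nesting: neither counted nor followed
        total += 1  # the path ending at this child
        if child_type is None:
            continue  # XML leaf, nothing below it
        if stop_at_data_types and child_type in DATA_TYPES:
            continue  # method 2: a data type top node counts as 1
        total += count_xpaths(child_type, stop_at_data_types, on_path)
    return total


to_data_types = count_xpaths("Message", stop_at_data_types=True)
to_leaves = count_xpaths("Message", stop_at_data_types=False)
print(to_data_types, to_leaves)
```

Even in this tiny example the count down to leaves is noticeably larger than the count down to data type nodes, because every data type occurrence brings its whole internal structure with it.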
Doing this for a couple of messages, I got:
- PA 'new person': 1,100 XPaths down to data type nodes, 77,000 XPaths
down to leaves
- Lab 'observation request': 65,000 XPaths down to data type nodes, 4.4
million down to leaves.
So the 65,000 is not 10% of anything - sorry to have misled you. BTW the
lab number only includes 1 of the 4 choices at the top.
For both messages, the average multiplier from data type nodes to leaves
is about 70. This is dominated by a few large data types, each of which
has a large number of leaf nodes:
type: PQ; count: 175
type: IVL_TS; count: 548
type: AD; count: 135
type: PN; count: 592
type: CV; count: 129
type: CR; count: 213
type: ON; count: 576
type: CE; count: 383
type: CD; count: 129
type: PQR; count: 171
type: EN; count: 592
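
As a quick check of the multiplier quoted above, the ratio of leaf-level XPaths to data-type-level XPaths can be recomputed from the four counts given for the two messages. The variable names are just for illustration, and the lab leaf count is the rounded "4.4 million" figure, so the second ratio is only approximate.

```python
# Counts quoted in this post: (XPaths down to data type nodes, XPaths down to leaves).
# The lab leaf count is the rounded 4.4 million figure.
counts = {
    "PA new person": (1100, 77000),
    "Lab observation request": (65000, 4.4e6),
}

# Average number of leaf-level paths per data-type-level path.
multipliers = {
    name: leaves / data_type_nodes
    for name, (data_type_nodes, leaves) in counts.items()
}

for name, ratio in multipliers.items():
    print(f"{name}: {ratio:.1f}")
```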
When you weight these by number of occurrences, a few main culprits stand
out. Of the 4.4 million leaf nodes in the lab message, they account for:
type: ON; count: 315072
type: II; count: 108030
type: CE; count: 1657241
type: IVL_TS; count: 980920
type: AD; count: 238545
type: EN; count: 873792
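
The weighting can be unpicked with a little arithmetic. The sketch below assumes that each weighted figure is simply (leaf nodes per data type) x (number of occurrences of that type in the lab message); that is an inference from the numbers, not something stated explicitly. II is left out of the division because its per-type leaf count is not listed above.

```python
# Leaf nodes per data type, from the first list in this post.
leaves_per_type = {"ON": 576, "CE": 383, "IVL_TS": 548, "AD": 135, "EN": 592}

# Leaf nodes per data type weighted by occurrences in the lab message.
weighted = {
    "ON": 315072,
    "II": 108030,
    "CE": 1657241,
    "IVL_TS": 980920,
    "AD": 238545,
    "EN": 873792,
}

# Implied number of occurrences of each type (assumes weighted = leaves * occurrences).
occurrences = {}
for data_type, per_type in leaves_per_type.items():
    quotient, remainder = divmod(weighted[data_type], per_type)
    occurrences[data_type] = quotient
    print(data_type, quotient, "remainder", remainder)

# Share of the (rounded) 4.4 million leaf nodes covered by these six types.
share = sum(weighted.values()) / 4.4e6
print(f"share of all leaf nodes: {share:.0%}")
```

If the assumption holds, the divisions come out exact, and the implied occurrence counts show that CE dominates mainly because it is used so often rather than because it is the largest type.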
Now I guess the big data types almost never come anywhere near 500 nodes
in actual instances, and developers can learn simplified versions that
cover 99% of cases; so the huge number of leaves (average 70 per data
type) does not worry me too much. But 65,000 separate XPaths in one
message, before you get down to data types, does worry me.
By the way, I was really quoting these numbers to agree with you - if
node count is at all sensible as a measure of complexity, then any
automatic or centrally decreed message flattening will do no good at
all, because it hardly reduces node count. I don't think flattening per
se will reduce the developer learning curve. I agree with you that HL7
should not increase the burden of message formats it imposes on itself
or its implementers. It should let them do their own message reshaping
when they want to - knowing they can always 'reflate' messages to be V3
conformant, automatically and accurately, if their recipients prefer a
RIM-centric approach.
Best wishes
Robert
Charteris plc, Charteris House, 39/40 Bartholomew Close, London EC1A 7JN
phone: 44 1353 777668; Fax 44 1353 777394; mobile: 44 7970 197968