Much Improved 'dump' function and XML simplify

Posted by horacebury, Posted on February 2, 2012, Last updated February 21, 2012

UPDATE: This post now has a lot more functionality, described just before the code...

Original blog post: http://blog.anscamobile.com/2011/07/how-to-use-xml-files-in-corona/

Original shared code: https://developer.anscamobile.com/code/xml-table-parser

This code is an update/huge improvement to my original post, above.

The dump function traverses a table's tree and prints all of its contents to the console. Each line it prints is the expression you would use in code to access that piece of data.
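To give a feel for the kind of output dump produces, here is a minimal standalone sketch of the same traversal. It is not the dump function from main.lua: the name `paths` is made up for illustration, it returns the lines instead of printing them, and it starts from an empty prefix rather than '{}'. The sample `level` table is hand-built, not parsed from a file.

```lua
-- Hypothetical helper: walks a table like dump does, but collects the
-- access paths in a list instead of printing them.
local function paths(t, prefix, out)
    out = out or {}
    prefix = prefix or ''
    if type(t) ~= 'table' then return out end
    -- named keys first, using '.' notation
    for k, v in pairs(t) do
        if type(k) ~= 'number' then
            out[#out + 1] = prefix .. '.' .. k .. ' = ' .. tostring(v)
            paths(v, prefix .. '.' .. k, out)
        end
    end
    -- then the array part, using [i] notation
    for i = 1, #t do
        out[#out + 1] = prefix .. '[' .. i .. '] = ' .. tostring(t[i])
        paths(t[i], prefix .. '[' .. i .. ']', out)
    end
    return out
end

-- Hand-built sample table (an assumption, not real parser output)
local level = { scene = { group = { { name = 'background' }, { name = 'balloons' } } } }

local lines = paths(level)
for _, line in ipairs(lines) do print(line) end
```

Lines for nested tables show a table address, and lines for leaf values such as `.scene.group[2].name = balloons` show the path you would type in code.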

xml.lua adds the simplify function (replacing the code in my previous post, above), which takes the output of the loadFile function and restructures it the way you would if you built the table yourself. It prints the structure as it goes, which should show you how the tables are built.

Simplify creates a table for each xml element, with a named index for each of that element's properties. When an element name occurs more than once under the same parent, the named entry becomes a numerically indexed table so that each element can be accessed in order.
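One consequence of that rule is that the same element name can come back as a single table or as a list, depending on how many there were. The sketch below shows both shapes with hand-built tables, plus a small helper for callers that do not know in advance which shape they will get. `asList` is a made-up name, not part of xml.lua.

```lua
-- Shape for a parent with ONE <group> child: a plain named table
local single = { group = { name = 'background' } }

-- Shape for a parent with TWO <group> children: a numerically indexed table
local multiple = { group = { { name = 'background' }, { name = 'balloons' } } }

-- Hypothetical helper: always hand back a list, whatever shape was stored
local function asList(entry)
    if entry == nil then return {} end
    if type(entry) == 'table' and #entry > 0 then return entry end
    return { entry }
end

for i, g in ipairs(asList(single.group)) do print(i, g.name) end
for i, g in ipairs(asList(multiple.group)) do print(i, g.name) end
```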

Running the code below will first print the structure as the simplify function sees the xml and then the code you would use to access the data, as seen by the dump function.

If you have questions please either post here or mail me at horace dot bury [the usual gmail com].

(I should note that, as made obvious by the sample xml file below, yes - I am building a game level editor utilising XML.)

Update:
I have just added some more functions to make the XML library much more useful. main.lua now demonstrates the following:

loadFile: (original) Loads an XML file into a table
saveFile: (new) Saves an XML file from a table (requires the root element name as a parameter)
toXml: (new) Converts the table returned from loadFile into a string. (Used by saveFile.)
simplify: (new) Converts the table returned from loadFile into a more usable table structure. For example, a collection of elements with the same name no longer sits inside a table called child but becomes a table of that name, accessible with '.' notation.
desimplify: (new) Reverses the simplify operation, converting the simplified table back to the format as returned by the loadFile function.
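For comparison, this is roughly what the un-simplified format looks like to work with. Each node carries the fields name, value, properties and child, as built by ParseXmlText in xml.lua below. The table here is hand-written to mimic part of levelone.xml rather than loaded from disk, and `findChildren` is a hypothetical helper, not a library function.

```lua
-- Hand-built excerpt in the loadFile format (assumed shape, not parser output)
local raw = {
    name = 'level', properties = { mode = 'edit' },
    child = {
        { name = 'scene', properties = {}, child = {
            { name = 'group', properties = { name = 'background' }, child = {} },
            { name = 'group', properties = { name = 'balloons' }, child = {} },
        } },
    },
}

-- Hypothetical helper: collect the direct children with a given element name
local function findChildren(node, name)
    local found = {}
    for i = 1, #node.child do
        if node.child[i].name == name then
            found[#found + 1] = node.child[i]
        end
    end
    return found
end

local scene = findChildren(raw, 'scene')[1]
local groups = findChildren(scene, 'group')
print(#groups, groups[2].properties.name)
```

Without a helper like this you index through child by position, for example `raw.child[1].child[2].properties.name`, which is the access pattern simplify is meant to replace.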

For examples of how to use the code, run main.lua with the levelone.xml file below.

main.lua:

local xmlapi = require( "xml" ).newParser()
local xml = xmlapi:loadFile( "levelone.xml" )
local bag = {}
 
function dump(t, indent)
        local notindent = (indent == nil)
        if (notindent) then print('-----dump-----'); indent='{}'; end
        if (t and type(t) == 'table') then
                for k, v in pairs(t) do
                        if (type(k) ~= 'number') then
                                print(indent .. '.' .. k .. ' = ' .. tostring(v))
                                if (indent) then
                                        dump(v, indent..'.'..k)
                                end
                        end
                end
                for i=1, #t do
                        print(indent .. '[' .. i .. '] = ' .. tostring(t[i]))
                        dump(t[i], indent .. '[' .. i .. ']')
                end
        end
        if (notindent) then print('-----dump-----'); end
end
 
print('\n\nThe tables as originally accessed:\n\n')
 
dump(xml)
 
print('\n\nThe tables converted back to XML using XmlParser:toXml():\n\n')
 
print(xmlapi:toXml('level', xml))
 
local success = xmlapi:saveFile("test.xml", system.DocumentsDirectory, 'level', xml)
print('\n\n----------\nWriting XML success: ', success, '\n')
print('Written to: ', system.pathForFile( "test.xml", system.DocumentsDirectory), '\n----------')
print('\nContent written:\n\n')
for line in io.lines(system.pathForFile( "test.xml", system.DocumentsDirectory)) do print(line) end
 
print('\n\nThe XML turned into a named table structure as XmlParser:simplify sees it:\n\n')
 
bag = xmlapi:simplify( xml )
 
print('\n\nThe simplified table structure as you would access it in code, printed using dump:\n\n')
 
dump(bag)
 
print('\n\nThe de-simplified table structure:\n\n')
 
dump(xmlapi:desimplify('level', bag))
 
print('\n\nThe de-simplified table converted back to XML:\n\n')
 
print(xmlapi:toXml('level', xmlapi:desimplify('level', bag)))

xml.lua:

module(..., package.seeall)
 
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
--
-- xml.lua - XML parser for use with the Corona SDK.
--
-- version: 1.1
--
-- CHANGELOG:
--
-- 1.1 - Fixed base directory issue with the loadFile() function.
--
-- NOTE: This is a modified version of Alexander Makeev's Lua-only XML parser
-- found here: http://lua-users.org/wiki/LuaXml
--
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
 
function newParser()
 
        XmlParser = {};
 
        function XmlParser:ToXmlString(value)
                value = string.gsub (value, "&", "&amp;");         -- '&' -> "&amp;"
                value = string.gsub (value, "<", "&lt;");               -- '<' -> "&lt;"
                value = string.gsub (value, ">", "&gt;");               -- '>' -> "&gt;"
                value = string.gsub (value, "\"", "&quot;");    -- '"' -> "&quot;"
                value = string.gsub(value, "([^%w%&%;%p%\t% ])",
                        function (c)
                                return string.format("&#x%X;", string.byte(c))
                        end);
                return value;
        end
 
        function XmlParser:FromXmlString(value)
                value = string.gsub(value, "&#x([%x]+)%;",
                        function(h)
                                return string.char(tonumber(h,16))
                        end);
                value = string.gsub(value, "&#([0-9]+)%;",
                        function(h)
                                return string.char(tonumber(h,10))
                        end);
                value = string.gsub (value, "&quot;", "\"");
                value = string.gsub (value, "&apos;", "'");
                value = string.gsub (value, "&gt;", ">");
                value = string.gsub (value, "&lt;", "<");
                value = string.gsub (value, "&amp;", "&");
                return value;
        end
 
        function XmlParser:ParseArgs(s)
          local arg = {}
          string.gsub(s, "(%w+)=([\"'])(.-)%2", function (w, _, a)
                        arg[w] = self:FromXmlString(a);
                end)
          return arg
        end
 
        function XmlParser:ParseXmlText(xmlText)
          local stack = {}
          local top = {name=nil,value=nil,properties={},child={}}
          table.insert(stack, top)
          local ni,c,label,xarg, empty
          local i, j = 1, 1
          while true do
                ni,j,c,label,xarg, empty = string.find(xmlText, "<(%/?)([%w:]+)(.-)(%/?)>", i)
                if not ni then break end
                local text = string.sub(xmlText, i, ni-1);
                if not string.find(text, "^%s*$") then
                  top.value=(top.value or "")..self:FromXmlString(text);
                end
                if empty == "/" then  -- empty element tag
                  table.insert(top.child, {name=label,value=nil,properties=self:ParseArgs(xarg),child={}})
                elseif c == "" then   -- start tag
                  top = {name=label, value=nil, properties=self:ParseArgs(xarg), child={}}
                  table.insert(stack, top)   -- new level
                else  -- end tag
                  local toclose = table.remove(stack)  -- remove top
                  top = stack[#stack]
                  if #stack < 1 then
                        error("XmlParser: nothing to close with "..label)
                  end
                  if toclose.name ~= label then
                        error("XmlParser: trying to close "..toclose.name.." with "..label)
                  end
                  table.insert(top.child, toclose)
                end
                i = j+1
          end
          local text = string.sub(xmlText, i);
          if not string.find(text, "^%s*$") then
                  stack[#stack].value=(stack[#stack].value or "")..self:FromXmlString(text);
          end
          if #stack > 1 then
                error("XmlParser: unclosed "..stack[#stack].name)
          end
          return stack[1].child[1];
        end
 
        function XmlParser:saveFile(xmlFilename, base, rootElementName, xmltbl)
                if not base then
                        base = system.TemporaryDirectory
                end
 
                local path = system.pathForFile( xmlFilename, base )
                local hFile, err = io.open(path, "w")
 
                if hFile and not err then
                        hFile:write(XmlParser:toXml(rootElementName, xmltbl))
                        io.close(hFile)
                        return true
                else
                        print( err )
                        return false
                end
        end
 
        function XmlParser:loadFile(xmlFilename, base)
                if not base then
                        base = system.ResourceDirectory
                end
 
                local path = system.pathForFile( xmlFilename, base )
                local hFile, err = io.open(path,"r")
 
                if hFile and not err then
                        local xmlText=hFile:read("*a"); -- read file content
                        io.close(hFile);
                        return self:ParseXmlText(xmlText),nil;
                else
                        print( err )
                        return nil
                end
        end
 
        function XmlParser:toXml(wrapElementName, xmltbl)
                local function getXml(xmltbl, indent)
                        -- collect tag properties
                        local props = ''
                        for k, v in pairs(xmltbl.properties) do
                                if (k ~= 'value' and type(v) ~= 'function') then
                                        props = props .. ' ' .. k .. '="' .. v .. '"'
                                end
                        end
                        -- build element
                        if (xmltbl.value ~= nil) then
                                -- open with body content
                                return indent .. '<' .. xmltbl.name .. props .. '>'
                                        .. tostring(xmltbl.value)
                                        .. '</' .. xmltbl.name .. '>'
                        elseif (#xmltbl.child > 0) then
                                -- open element with content
                                local str = ''
                                for i=1, #xmltbl.child do
                                        if (type(xmltbl.child[i]) ~= 'function') then
                                                str = str .. '\n' .. getXml(xmltbl.child[i], '   '..indent)
                                        end
                                end
                                return indent .. '<' .. tostring(xmltbl.name) .. props .. '>'
                                        .. str
                                        .. '\n' .. indent .. '</' .. tostring(xmltbl.name) .. '>'
                        else
                                -- self terminating
                                if (props == '') then
                                        return indent .. '<' .. xmltbl.name .. ' />'
                                else
                                        return indent .. '<' .. xmltbl.name .. props .. ' />'
                                end
                        end
                end
 
                return getXml(xmltbl, '')
        end
 
        function XmlParser:simplify( xml, tbl, indent )
                if (indent == nil) then indent = ''; else indent = indent .. '   '; end
                if (tbl == nil) then tbl = {}; end
 
                tbl['__special'] = nil
                local function addSpecial( key, value )
                        if (tbl['__special'] == nil) then tbl['__special'] = {}; end
                        tbl['__special'][key] = value
                end
 
                print(indent .. xml.name)
                for k, v in pairs(xml.properties) do
                        print(indent .. '   .' .. k .. ' = ' .. v)
                        tbl[k] = v
                end
 
                if (xml.value ~= nil) then
                        print(indent .. '   "' .. xml.value .. '"')
                        tbl.value = xml.value
                end
 
                if (#xml.child > 0) then print(indent..'{'); end
                for i=1, #xml.child do
                        local name = xml.child[i].name
                        local t = tbl[name]
                        local v = XmlParser:simplify( xml.child[i], nil, indent )
            if (t == nil) then
                -- element name not seen yet
                tbl[name] = v
            elseif (#t == 0) then
                -- second sighting of element name, convert into table
                print(indent .. '   ,')
                t = { t }
                tbl[name] = t
                t[2] = v
            else
                -- numerous sighting of element name, add to table
                print(indent .. '   ,')
                t[#t+1] = v
            end
                        if (type(v) == "string") then
                                addSpecial( name, "isbody" )
                        end
                end
                if (#xml.child > 0) then print(indent..'}'); end
 
        function tablelength(T)
            local count = 0
            for _ in pairs(T) do count = count + 1 end
            return count
        end
 
        local tblAttrCnt = tablelength(tbl)
 
        if tbl.value and tblAttrCnt == 1 then
            -- If an entity has only a value, then make that the value of the entity (instead of a .value attribute)
            tbl = tbl.value
        end
 
                return tbl
        end
 
        function XmlParser:desimplify( name, tbl, indent )
                if (not indent) then indent = ''; end
                local t = { name=name, properties={}, child={} }
 
                if (#tbl == 0) then
                        for k, v in pairs(tbl) do
                                if (k ~= "__special") then
                                        if (type(v) == 'table') then
                                                if (#v == 0) then
                                                        t.child[ #t.child+1 ] = XmlParser:desimplify(k, v, indent..'   ')
                                                else
                                                        for i=1, #v do
                                                                t.child[ #t.child+1 ] = XmlParser:desimplify(k, v[i], indent..'   ')
                                                        end
                                                end
                                        else
                                                local isbody = tbl['__special'] and tbl['__special'][k] == 'isbody'
 
                                                if (isbody) then
                                                        t.child[ #t.child+1 ] = { name=k, properties={}, child={}, value=v }
                                                elseif (k == 'value' and not isbody) then
                                                        t.value = v
                                                else
                                                        t.properties[k] = v
                                                end
                                        end
                                end
                        end
                end
 
                return t
        end
 
        return XmlParser
end

levelone.xml (My sample xml file from my level editor):

<?xml version="1.0"?>
<level mode="edit">
        <resources>
                <requires />
                <images />
                <bodies />
                <joints />
        </resources>
        <scene>
                <group name="background">
                        <image name="bg1" x="contentCenterX" y="contentCenterY" />
                </group>
                <group name="balloons">
                        <body name="blue" path="bg1.png" x="100" y="100" require="balloon" physicsdata="blue" />
                        <body name="icon" path="icon2-down.png" x="300" y="100" />
                        <body name="test" path="test.png">Test</body>
                </group>
        </scene>
</level>


Replies

bob.dickinson
Joined: 15 Dec 2011

I'm in the process of implementing Amazon S3 support from Corona and I found this very useful for parsing the XML responses and then converting them to Lua tables (using simplify) that are representative of those XML structures. Thanks!

One note: If you copy the code for xml.lua from above, you will have an issue. The first two functions have a bunch of ampersand-escaped values in the original code which get converted to entities in the HTML display on this page (and are thus lost in the process). I just copied the version of those two functions from the github version and pasted them into the code I copied from above and it worked fine.
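For anyone checking their copy, here is a minimal sketch of what the escape and unescape steps should do for the named entities. These are standalone functions with made-up names (`escapeXml`, `unescapeXml`), not the ToXmlString and FromXmlString methods themselves, and they leave out the numeric character references. The order matters: '&' is escaped first and '&amp;' is unescaped last, otherwise already-processed text gets mangled.

```lua
-- Hypothetical helper: escape the characters that are special in XML
local function escapeXml(s)
    s = s:gsub('&', '&amp;')   -- must come first
    s = s:gsub('<', '&lt;')
    s = s:gsub('>', '&gt;')
    s = s:gsub('"', '&quot;')
    return s
end

-- Hypothetical helper: reverse of escapeXml
local function unescapeXml(s)
    s = s:gsub('&quot;', '"')
    s = s:gsub('&apos;', "'")
    s = s:gsub('&gt;', '>')
    s = s:gsub('&lt;', '<')
    s = s:gsub('&amp;', '&')   -- must come last
    return s
end

local original = 'a < b & "c"'
local escaped = escapeXml(original)
print(escaped)
print(unescapeXml(escaped) == original)
```

If your pasted copy has lines that replace a character with itself, such as '<' with '<', the entities were lost in the page rendering.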

I think I will make one tweak to simplify(), and that is that I have a lot of XML that uses only values (and no attributes), so I get a lot of this:

.Contents[1].LastModified = table: 025F6858
.Contents[1].LastModified.value = 2012-02-18T23:28:35.000Z
.Contents[1].Key = table: 025F7488
.Contents[1].Key.value = boulder.png
.Contents[1].StorageClass = table: 02612420
.Contents[1].StorageClass.value = STANDARD
.Contents[1].ETag = table: 025F6420
.Contents[1].ETag.value = "ea9035ce951323d8c66a3c4dabda9e64"
.Contents[1].Owner = table: 025F6AD8
.Contents[1].Owner.ID = table: 02612DD0

I think I'll make it so that if the only entry is a value entry, it will promote it up one level (so I can just do Contents[1].Key, for example).
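In isolation the rule might look something like the sketch below. The names `countKeys` and `promoteValue` are made up for illustration; this is a rough picture of the idea, not the code that ends up in simplify().

```lua
-- Count every key in a table, named or numeric
local function countKeys(t)
    local n = 0
    for _ in pairs(t) do n = n + 1 end
    return n
end

-- Hypothetical helper: collapse { value = x } to x, leave anything else alone
local function promoteValue(t)
    if type(t) == 'table' and t.value ~= nil and countKeys(t) == 1 then
        return t.value
    end
    return t
end

local key = promoteValue({ value = 'boulder.png' })             -- only a value: promoted
local body = promoteValue({ value = 'Test', name = 'test' })    -- has an attribute too: kept
print(key)
print(type(body))
```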

But anyway, thanks again. Saved me some work.

bob.dickinson
Joined: 15 Dec 2011

Here's what I did to simplify():

    function XmlParser:simplify( xml, tbl, indent )
        if (indent == nil) then indent = ''; else indent = indent .. '   '; end
        if (tbl == nil) then tbl = {}; end
 
        print(indent .. xml.name)
        for k, v in pairs(xml.properties) do
            print(indent .. '   .' .. k .. ' = ' .. v)
            tbl[k] = v
        end
 
        if (xml.value ~= nil) then
            print(indent .. '   "' .. xml.value .. '"')
            tbl.value = xml.value
        end
 
        if (#xml.child > 0) then print(indent..'{'); end
        for i=1, #xml.child do
            local name = xml.child[i].name
            local t = tbl[name]
            if (t == nil) then
                -- element name not seen yet
                tbl[name] = XmlParser:simplify( xml.child[i], nil, indent )
            elseif (#t == 0) then
                -- second sighting of element name, convert into table
                print(indent .. '   ,')
                t = { t }
                tbl[name] = t
                t[2] = XmlParser:simplify( xml.child[i], nil, indent )
            else
                -- numerous sighting of element name, add to table
                print(indent .. '   ,')
                t[#t+1] = XmlParser:simplify( xml.child[i], nil, indent )
            end
        end
        if (#xml.child > 0) then print(indent..'}'); end
        
        function tablelength(T)
            local count = 0
            for _ in pairs(T) do count = count + 1 end
            return count
        end
 
        local tblAttrCnt = tablelength(tbl)
        if tblAttrCnt == 0 then
            -- If an entity has no attributes/values, make it an empty string instead of an empty table
            tbl = ""
        elseif tbl.value and tblAttrCnt == 1 then
            -- If an entity has only a value, then make that the value of the entity (instead of a .value attribute)               
            tbl = tbl.value
        end
 
        return tbl
    end

Now I get a nice clean table like this:

.Contents[1].LastModified = 2012-02-18T23:28:35.000Z
.Contents[1].Key = boulder.png
.Contents[1].StorageClass = STANDARD
.Contents[1].ETag = "ea9035ce951323d8c66a3c4dabda9e64"
.Contents[1].Owner = table: 00FAAD18
.Contents[1].Owner.ID = bf17d55f14d870a468c62df9e44193847cddfe0c9ef5a82a5b809f0533154a04
.Contents[1].Owner.DisplayName = bigdog
.Contents[1].Size = 15129

I did not do the reverse of this in desimplify(). I think I'd need some test cases to make sure I didn't mess that up.

Anyway, not sure if this is in line with your original intent, but it works better for my use case. Also not entirely sure I didn't break some part of simplify() that I wasn't using. I'd appreciate it if you'd let me know if you see any problems.

horacebury
Joined: 17 Aug 2010

Hey, nice work. I will try to fix the HTML encoding issue in the post.

The only reason I decided against processing body values of elements as direct values is that during desimplification they would become an attribute of their parent element. That changes the structure of the source XML and, in cases where the body content is long or multi-line, produces invalid XML.

I have been thinking about improving it using private control tables to indicate special cases, so I'll see what I can do...

horacebury
Joined: 17 Aug 2010

OK @bob, I've been taking a look at this, this morning, and I think I've added your improvement and a bit more...

It took a bit of work, but I've added the first use of the __special table to parent element tables. This means that the simplify table will now have indicators when a child element contained only body text and no properties.

Comment out the IF which reads:
if (k ~= "__special") then
and you'll see what the simplified table actually contains.
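As a rough illustration of what those markers are for, the sketch below classifies a key of a simplified table the same way the branches in desimplify do: marked as "isbody" means it was a child element holding only body text, a key called value is the element's own body text, and anything else is an attribute. The function name `classify` and the sample tables are made up for this example.

```lua
-- Hypothetical helper mirroring the decision desimplify makes for non-table values
local function classify(tbl, k)
    local special = tbl['__special']
    if special and special[k] == 'isbody' then
        return 'child'       -- becomes <k>text</k> under the parent
    elseif k == 'value' then
        return 'body'        -- becomes the parent's own body text
    else
        return 'attribute'   -- becomes k="text" on the parent
    end
end

-- Hand-built simplified tables (assumed shapes)
local body = { name = 'test', path = 'test.png', value = 'Test' }
local owner = { DisplayName = 'bigdog', __special = { DisplayName = 'isbody' } }

print(classify(body, 'name'))          -- attribute
print(classify(body, 'value'))         -- body
print(classify(owner, 'DisplayName'))  -- child
```

Without the marker, a promoted string such as DisplayName would look the same as an attribute and would be written back as one.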

I have also removed one of your conditions: When the child element contains no properties and no body text (is self-terminating) I believe it should be simplified into an empty table because I believe the rule should be "no body text means it is not a string" on the basis that an array with nothing in it is just an empty box. But this is open to interpretation and preference, I suppose.

Anyway, now that the __special control table is implemented, there's a lot of room for improvement, and with my test XML files I could not see your version of the code breaking anything from the original. Frankly, the original code mashes my head a lot!

Dotnaught
Joined: 7 Jul 2009

Thanks for working on this. I assume this would work with Amazon Simple DB too?

horacebury
Joined: 17 Aug 2010

I have no idea, I'm afraid.

bob.dickinson
Joined: 15 Dec 2011

@Dotnaught - I've been using a tweaked version of this in my Amazon S3 library without any issues, so my guess is that it would handle SimpleDB. Are you working on a SimpleDB library by any chance?

Dotnaught
Joined: 7 Jul 2009

I'm thinking about it. I may need more than basic CRUD functionality, in which case I would probably go with a custom App Engine client account module or Parse/StackMob. But SimpleDB seems like the best option for basic storage at a reasonable price at the moment. I might try plugging your S3 library in to see how it works with SimpleDB.