How to get and parse XML for storing in a DB in Java

Initialize!

I first read a config file to get the DB connection and directory parameters I need. I also read optional pre-stored-procedure and post-stored-procedure parameters in case we decide to do some DB-side dirty work later.
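A minimal sketch of that config read, assuming a standard Java properties file. The key names (db.url, db.user, download.dir, proc.pre, proc.post) and the LoaderConfig class are made up for illustration; they are not from my actual config.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical holder for the settings read at startup.
class LoaderConfig {
    final String dbUrl;
    final String dbUser;
    final String downloadDir;
    final String preProcedure;   // optional, null when not configured
    final String postProcedure;  // optional, null when not configured

    LoaderConfig(Properties p) {
        dbUrl = required(p, "db.url");
        dbUser = required(p, "db.user");
        downloadDir = required(p, "download.dir");
        preProcedure = optional(p, "proc.pre");
        postProcedure = optional(p, "proc.post");
    }

    // Fail fast when a mandatory key is missing or blank.
    static String required(Properties p, String key) {
        String v = optional(p, key);
        if (v == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        return v;
    }

    // Return the trimmed value, or null when the key is absent or blank.
    static String optional(Properties p, String key) {
        String v = p.getProperty(key);
        return (v == null || v.trim().isEmpty()) ? null : v.trim();
    }

    // Accepts any Reader, e.g. a FileReader pointed at the config file.
    static LoaderConfig load(Reader in) throws IOException {
        Properties p = new Properties();
        p.load(in);
        return new LoaderConfig(p);
    }
}
```

In a real run you would pass a FileReader for the config file; the stored procedure names stay null unless someone sets them.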

Getting the XML file

A previous post had a nifty utility for getting files from a website using Perl. That was nice, but for this exercise we’ll need to use Java so we can share common logging & classes with other Java apps.

Using the DB connection from the initialization step, I look up the URLs I need to pull. This saves me from having to change code later for hard-coded crap like filenames (which are likely to change a lot).

Next – I loop through each file name and store it. The example below shows the actual pull:

https://gist.github.com/730902
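For readers who can't open the gist, here is a rough sketch of one way to do the pull with only the standard library. It is not the gist's code; the XmlFetcher class and its parameters are assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies whatever the URL serves into targetDir/fileName.
class XmlFetcher {
    static Path fetch(String url, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        // Stream straight to disk so a large file never sits fully in memory.
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}
```

The URL and file name would come from the DB lookup described above, so nothing here is hard-coded.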

Parsing the XML file

Knowing what XML you’re getting each run is going to be a problem. Attributes are added, structures are changed, etc. The XML I’m dealing with doesn’t even have a key structure built in. So the first thing we need to do is define the parent/child relationships. If those don’t match what we expect, we should fail or perform some error handling. Otherwise, we know at least that the data we are loading conforms to our structure.
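One way to sketch that check is a map of each element to the children it is allowed to have, walked recursively over the DOM. The element names here (company, manager, employee) are placeholders for illustration, and StructureCheck is a hypothetical class, not the code I actually run.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class StructureCheck {
    // Expected parent -> allowed child element names (assumed structure).
    static final Map<String, Set<String>> ALLOWED = new HashMap<>();
    static {
        ALLOWED.put("company", new HashSet<>(Arrays.asList("manager")));
        ALLOWED.put("manager", new HashSet<>(Arrays.asList("employee")));
        ALLOWED.put("employee", Collections.<String>emptySet());
    }

    // Throws as soon as an element or a parent/child pair is not in the map.
    static void check(Element parent) {
        Set<String> allowed = ALLOWED.get(parent.getTagName());
        if (allowed == null) {
            throw new IllegalStateException("Unexpected element: " + parent.getTagName());
        }
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip text, comments, whitespace
            }
            Element child = (Element) n;
            if (!allowed.contains(child.getTagName())) {
                throw new IllegalStateException(
                    "Unexpected child " + child.getTagName() + " under " + parent.getTagName());
            }
            check(child);
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Calling StructureCheck.check(doc.getDocumentElement()) before the load means a surprise structure stops the run instead of quietly landing in the tables.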

So XML to DB tables… think of it as this:

https://gist.github.com/730939

Not great, I know. It’s better to have company_fk on the manager/employee tables than to have an org table. A manager & employee can be stored in the same table too. However, since we don’t know what the XML could look like the next time we run this, the structure we create suffices for now. (After this process, I have a stored procedure to get the data into the preferred format.)

So, how do I do this in Java?

https://gist.github.com/730985
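As a rough illustration of the idea (not the gist itself), the sketch below uses the DOM TreeWalker to visit every element and hand out a generated id plus the parent's id, since the XML carries no keys of its own. The XmlFlattener class and the "id,parentId,name" row format are assumptions; in practice each row would become a JDBC insert. It relies on the parser's Document implementing DocumentTraversal, which the JDK's built-in parser does.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;

class XmlFlattener {
    // Returns one "id,parentId,elementName" row per element; the root's parentId is 0.
    static List<String> flatten(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // SHOW_ELEMENT hides text and whitespace nodes from the walker.
        TreeWalker walker = ((DocumentTraversal) doc).createTreeWalker(
            doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, null, false);
        List<String> rows = new ArrayList<>();
        walk(walker, 0, new int[] {1}, rows);
        return rows;
    }

    static void walk(TreeWalker walker, int parentId, int[] nextId, List<String> rows) {
        Node current = walker.getCurrentNode();
        int id = nextId[0]++; // surrogate key generated during the walk
        rows.add(id + "," + parentId + "," + current.getNodeName());
        for (Node child = walker.firstChild(); child != null; child = walker.nextSibling()) {
            walk(walker, id, nextId, rows);
        }
        // Move the walker back up so the caller can continue with its siblings.
        walker.setCurrentNode(current);
    }
}
```

The generated ids only need to be unique within a run; the stored procedure afterwards can map them onto real keys.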

This site makes more use of the TreeWalker methods than I do, and it really helped me out:

http://oreilly.com/catalog/jenut2/chapter/ch19.html
