One of the few problems with using Python to process XML is the speed -- if the XML becomes somewhat large (>1Mb), it slows down exponentially as the size of the XML increases. One way to increase the processing speed is to break the XML down via tag name. This is especially handy if you are only interested in one part of the XML, or between certain elements throughout the XML.
This script contains a function that handle this problem. It uses the Sax reader from PyXML.
The In parameters are the XML as a string, the tag name that you want to build the DOM around, and an optional postition to start at within the XML. It returns a DOM tree and the character position that it stopped at.
Top 4 Download periodically updates information of Breaking large XML documents into chunks script from the developer, but some information may be slightly out-of-date.
Our script download links are directly from our mirrors or publisher's website. Breaking large XML documents into chunks torrent files or shared files from free file sharing and free upload services, including Rapidshare, MegaUpload, YouSendIt, MailBigFile, DropSend, HellShare, HotFile, FileServe, MediaMax, zUpload, MyOtherDrive, SendSpace, DepositFiles, Letitbit, LeapFile, DivShare or MediaFire, are not allowed!