Dred Scott issue solved. The file weighs in at 750 KB. The simple_html_dom.php library I’m using had MAX_FILE_SIZE set to 600 KB, so the file was over the limit. I pushed that to 10 MB just to be on the safe side. That did the trick.
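For anyone chasing a similar problem, here is a rough Python sketch (not the PHP code actually running on the server) that lists HTML files larger than a parser's size cap before they get processed. The `files_over_limit` name and the exact byte value of the cap are my assumptions for illustration.

```python
import os

# Assumed cap in bytes, standing in for the 600 KB limit mentioned above.
SIZE_LIMIT_BYTES = 600_000


def files_over_limit(root, limit=SIZE_LIMIT_BYTES):
    """Return (path, size) pairs for .html files under root larger than limit.

    Results are sorted largest first so the worst offenders show up at the top.
    """
    oversized = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".html"):
                continue
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > limit:
                oversized.append((path, size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Running something like this over the collection first would flag any oversized opinion before the parser silently fails on it.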
Updates from September, 2012
Elmer Masters
So, processing of SCOTUS opinions stumbles on the Dred Scott decision. No idea why. It is bigger than most, but only 750 KB. Same error on 2 machines. Seems like the DOM parser is unable to open the HTML file as an object. I’m looking into stuff like strange entities, but nothing is popping out and I can open the file in Firefox.
Elmer Masters
Watching docs process on the FLR server is like watching paint dry. Every opinion has to be opened, have metadata added or modified, and then be inserted into Solr. Takes some time.
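To give a feel for the per-opinion work, here is a minimal Python sketch of the open, add metadata, hand to Solr loop. It is not the code running on the FLR server. The field names (`id`, `title`, `content`), the helper names, and the idea of posting a JSON list of documents to Solr's update handler are all assumptions; the sketch only builds the payload and does not talk to a server.

```python
import json
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first-level <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_solr_doc(doc_id, html_text, extra_metadata):
    """Build one document dict; the field names here are hypothetical."""
    parser = TitleParser()
    parser.feed(html_text)
    doc = {"id": doc_id, "title": parser.title.strip(), "content": html_text}
    doc.update(extra_metadata)  # added or modified metadata wins
    return doc


def to_update_payload(docs):
    """Serialize documents as a JSON list, assuming the update handler accepts that shape."""
    return json.dumps(docs)
```

Every opinion goes through a parse like this, which goes some way to explaining the paint-drying pace.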
Elmer Masters
Realized that the index.html files in all of those directories do not need to be modified or indexed. Updated the code to ignore index.html files.
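The change amounts to a filter on the directory walk. A small Python sketch of the idea, assuming the opinions are `.html` files sitting alongside an `index.html` in each directory (the `opinion_files` name is made up for illustration):

```python
import os


def opinion_files(root):
    """Yield paths of .html files under root, skipping every index.html."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            lower = name.lower()
            if lower == "index.html" or not lower.endswith(".html"):
                continue
            yield os.path.join(dirpath, name)
```

Skipping the index pages at the walk stage means they never get opened, modified, or sent on for indexing.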
Elmer Masters
Finished up testing of SCOTUS cases from the 2008 PRO collections. Ready to load them into FLR. Should be interesting.
Elmer Masters
Before we get to the EPUBs, though, we need to make sure that all of that great metadata is being put into Solr. Finding these opinions is more important than recreating the book volumes.
Elmer Masters
So, since I’m already running through all these opinions, I’m going to create EPUB versions of US Reports. Watch for availability.
Elmer Masters
Back into Free Law Reporter today. I’m adding additional metadata to SCOTUS opinions. Also decided the PRO case dump from 2008 would be represented as Free Law Reporter 3rd. BTW, I enjoy making up my own reporter series.
Elmer Masters
More bans on the Drupal system today: an IP address that belongs to Verizon FiOS and the panscient.com spider. Both repeatedly opened thousands of connections to seemingly random pages on the site over the course of 90 minutes this afternoon. Bad Spiders.
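Spotting this kind of thing comes down to counting hits per client. A rough Python sketch, assuming an access log where the client IP is the first whitespace-separated field on each line; the `heavy_hitters` name and the threshold are placeholders, not what the server actually uses:

```python
from collections import Counter


def heavy_hitters(log_lines, threshold=1000):
    """Return (ip, count) pairs with at least `threshold` requests, busiest first.

    Assumes the client IP is the first whitespace-separated field of each line.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]
```

Feeding it the lines from a 90-minute window of the log would surface any address worth a closer look before banning it.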
Elmer Masters
Turns out unpacking the Fastcase Advance Sheets .MOBI version is pretty straightforward. There’s a hunk of Python for that at http://wiki.mobileread.com/wiki/Mobi_unpack. Once unpacked, it’s all about processing the HTML into individual opinion files.
It would be nice to use the .EPUB files, but at the moment those are locked away in iBooks.