[jsword-devel] Out of Memory Issues Loading repo module lists
    DM Smith 
    dmsmith at crosswire.org
       
    Mon Jan 11 19:34:48 MST 2016
    
    
  
I’ve sent you a zip with the SwordBookMetaData, AbstractSwordInstaller and a test that was modified.
— DM
> On Jan 11, 2016, at 6:29 PM, DM Smith <dmsmith at crosswire.org> wrote:
> 
>> 
>> On Jan 11, 2016, at 6:07 PM, Martin Denham <mjdenham at gmail.com <mailto:mjdenham at gmail.com>> wrote:
>> 
>> My estimate of file size might be too low because I forgot to take block size into account.  A quick test on my Android device adds about 40%, making it at least 7 MB for the conf files.
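The block-size effect mentioned here can be estimated by rounding each file up to a whole number of blocks. The sketch below is only an illustration: the 4096-byte block size and the file sizes are made-up assumptions, not measurements from any repository.

```java
final class BlockSizeEstimate {
    /** Bytes a file occupies when space is allocated in whole blocks. */
    static long onDiskSize(long fileSize, long blockSize) {
        long blocks = (fileSize + blockSize - 1) / blockSize; // round up
        return blocks * blockSize;
    }

    /** Sum of the on-disk sizes of many small files. */
    static long totalOnDisk(long[] fileSizes, long blockSize) {
        long total = 0;
        for (long size : fileSizes) {
            total += onDiskSize(size, blockSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long[] confSizes = {700, 1500, 5000}; // hypothetical conf sizes in bytes
        System.out.println(totalOnDisk(confSizes, 4096));
    }
}
```

With many small conf files, most of them occupy a full block regardless of their real size, which is why the overhead can be noticeable.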
> 
> Understand.
> 
>> 
>> By 'fluff' do you mean extracting all the files from mods.d.tar.gz and writing them all to disk?  I am a little concerned about repeatedly writing and deleting hundreds of small files on the SD card.  SD cards do not handle heavy read/write activity as well as normal disks or flash drives.  That is the reason I do not store the AB database on the SD card.
> 
> By fluff, I meant that the conf file would be re-read without a filter, thus getting everything.
> 
>> 
>> The process described would also make viewing a description (in AB right-click About) an unexpectedly expensive operation involving writing hundreds of files to the sd card.
> 
> It would involve re-reading the one file without a filter. It should happen fast.
>> 
>> I did not know about sbmd.toOSIS() and have not used it.  AB just pops up a little dialog with a few fields like About, copyright, licence, version, versification.
> 
> Ok. Then you’ll need to call “fluff” before retrieving those fields. The code for fluff (or whatever we call it) would be something like:
> public void fluff() {
>     if (partiallyLoaded) {
>         // re-read and process the conf without a filter
>         partiallyLoaded = false;
>     }
> }
> 
>> 
>> For the 2 reasons above my preference would be to avoid writing hundreds of files to the SD card but I can't think of a perfect solution.  While grappling with this last week I was just trying to get the original code to work more efficiently (but failed).  I am not very experienced in Memory Analysis but suspected the memory use was higher than it might have been.
> By design, which files do you write to SD card? If they are only written when the mods.d.tar.gz is downloaded, would that help?
> 
>> 
>> If the tar.gz were searched for the conf each time a Description is requested, every request would be more expensive, but the first one would be quicker than writing hundreds of .conf files. To be honest, I think a lot of people do not know about the long-press menu in AB, so probably just the initial list of modules would be used most of the time.
> 
> I don’t know about the long-press menu. In BibleDesktop, it is easy to navigate from one available to the next and each time it shows the full conf.
> 
>> 
>> Coincidentally my android slowed to a crawl when I tried to copy all of eBible's .conf files to it just now - initially fast then 1 file per 3 secs, after 10 minutes I unplugged it, although that probably is not a realistic test and there is probably an explanation for the issue.
> 
> I’ve nearly got the code written to unpack the conf. Let me zip up the files that have changed and send them to you.
> 
> Basically, if you delete mods.d.tar.gz, it will fetch a new one (current behavior). If you delete mods.d/, it will unpack mods.d.tar.gz into it. If you fetch mods.d.tar.gz, it will unpack it into mods.d. All of this takes place in the folder where mods.d.tar.gz is present.
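The rules described above can be read as a small decision function. This is a sketch under assumptions, not the code in the zip: the class name `RepoCacheState`, the `Action` enum and the method `nextAction` are hypothetical names chosen for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

final class RepoCacheState {
    /** What the installer would have to do next (hypothetical names). */
    enum Action { FETCH_AND_UNPACK, UNPACK, NONE }

    /** Decides the next step from what is present in the repo's cache folder. */
    static Action nextAction(Path repoDir) {
        Path archive = repoDir.resolve("mods.d.tar.gz");
        Path confDir = repoDir.resolve("mods.d");
        if (!Files.isRegularFile(archive)) {
            return Action.FETCH_AND_UNPACK; // no archive: fetch a new one, then unpack it
        }
        if (!Files.isDirectory(confDir)) {
            return Action.UNPACK;           // archive present but mods.d/ was deleted
        }
        return Action.NONE;                 // both present: nothing to do
    }
}
```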
> 
> I tried adding new confs to mods.d that weren’t in mods.d.tar.gz to simulate a takedown and that works as well.
> 
> If this code is no good for you, I’ve another thought. A file in a jar has a URL that is something like …/fred.jar!file. Maybe we can transform the mods.d.tar.gz into mods.d.tar and use that addressing mechanism to fetch the file? I’ll take a look at how the JRE does that. Maybe I’ll roll the same for JSword over a tar.gz file.
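For reference, the JRE's jar addressing has the form `jar:<url-of-archive>!/<entry-name>`. A minimal sketch of reading one entry that way follows; it works for jar/zip archives only, so a tar or tar.gz would still need custom handling as suggested above. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

final class JarEntryReader {
    /** Reads a single entry of a jar/zip file through a jar: URL. */
    static String readEntry(Path jarFile, String entryName) throws IOException {
        // Builds something like jar:file:///path/to/fred.jar!/kjv.conf
        URL url = new URL("jar:" + jarFile.toUri() + "!/" + entryName);
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

The attraction of this scheme is that the caller only keeps a URL per conf; the archive is opened on demand.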
> 
> DM
> 
>> 
>> Martin
>> 
>> On 11 January 2016 at 19:28, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>> I have been thinking about this a bit more. I knew there was a need to prevent stale confs. The time performance is something that I’m not able to test. My machine has an SSD, a fast 4 core CPU and gobs of RAM. So I need you to keep me in line. ;)
>> 
>> The easiest way to keep it pristine is to unpack it into a temporary folder, rename the old folder out of the way, rename the new folder into place, and finally delete the old folder. Doing it in this order minimizes the time that mods.d is unavailable, which is important for multi-threaded apps and for multiple apps sharing the same machine simultaneously.
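The rename sequence can be sketched with `java.nio.file`. This is an illustration under assumptions: the names `FolderSwap`, `swapIn` and the `.old` suffix are hypothetical, and both folders are assumed to live in the same parent folder so that each move is a plain rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FolderSwap {
    /** Replaces liveDir with the freshly unpacked freshDir. */
    static void swapIn(Path freshDir, Path liveDir) throws IOException {
        Path oldDir = liveDir.resolveSibling(liveDir.getFileName() + ".old");
        deleteTree(oldDir);                 // clear leftovers from an interrupted run
        if (Files.exists(liveDir)) {
            Files.move(liveDir, oldDir);    // 1. rename the old folder out of the way
        }
        Files.move(freshDir, liveDir);      // 2. rename the new folder into place
        deleteTree(oldDir);                 // 3. delete the old folder last
    }

    /** Deletes a folder and everything below it; does nothing if it is absent. */
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent folders.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

Only the two renames in the middle leave mods.d briefly missing; the slow parts (unpacking and deleting) happen before and after.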
>> 
>> Right now the SwordBookMetaData remembers the File for the conf of installed modules and is able to re-read it easily. But it does not store anything about a conf’s location when it is from mods.d.tar.gz. I suppose I could have it remember the location of mods.d.tar.gz and the name of the conf entry and create a method to extract that conf out of the compressed archive. This would need to be done for each module for which the user requests info. Doing this is quite expensive, as it means inflating the file and then iterating over the contents until the desired conf is found.
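To make the cost concrete, here is a minimal sketch of pulling one entry out of a tar.gz using only the JDK. It is not JSword's code: it uses a hand-rolled reader for plain tar headers (name at offset 0, octal size at offset 124) as a stand-in for whatever tar library is actually used, and it ignores long-name extensions and checksums.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

final class TarGzEntryFinder {
    private static final int BLOCK = 512; // tar works in 512-byte blocks

    /** Returns the bytes of the named entry, or null if it is not in the archive. */
    static byte[] find(InputStream tarGz, String wanted) throws IOException {
        DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz));
        byte[] header = new byte[BLOCK];
        while (true) {
            try {
                in.readFully(header);
            } catch (EOFException e) {
                return null;                       // archive ended without a trailer
            }
            if (header[0] == 0) {
                return null;                       // zero block marks the end of the archive
            }
            String name = field(header, 0, 100);
            long size = Long.parseLong(field(header, 124, 12).trim(), 8);
            if (name.equals(wanted)) {
                byte[] data = new byte[(int) size];
                in.readFully(data);
                return data;
            }
            // Not the one we want: inflate and discard its data, padded to a full block.
            long toSkip = (size + BLOCK - 1) / BLOCK * BLOCK;
            while (toSkip > 0) {
                int skipped = in.skipBytes((int) Math.min(toSkip, Integer.MAX_VALUE));
                if (skipped == 0) {
                    throw new EOFException("truncated tar archive");
                }
                toSkip -= skipped;
            }
        }
    }

    /** Reads a NUL-terminated ASCII field from a tar header. */
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
    }
}
```

Every lookup inflates the stream from the start and walks past all earlier entries, which is the expense described above.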
>> 
>> I think that it would be better to see how much time it adds to extract the files and store them on disk. The fluffing of them would only be when the user wants to browse a description of the module.
>> 
>> I’d like to modify sbmd.toOSIS to check if the sbmd is partial or full and, if not full, re-read the conf fully and then continue as before. I think that is how JSword is designed to retrieve the conf for presentation to the end user. Does AndBible use that or some other mechanism to get what it wants for presentation?
>> 
>> I think I’ll add a “fluff” method to BookMetaData that will do this. This could be called to get it to fluff at another time.
>> 
>> DM
>> 
>>> On Jan 11, 2016, at 1:00 PM, Martin Denham <mjdenham at gmail.com <mailto:mjdenham at gmail.com>> wrote:
>>> 
>>> My rough estimates have the total size of conf files in all repos at about 5 MB, which is not too different from the size of a module like ESV, so the impact should not be significant and it should not be a problem if this is required.
>>> 
>>> Other things to consider that come to mind i) would need to remove conf files no longer in mods.d.tar.gz or delete and re-extract everything after a refresh ii) Time taken to save files - loading the list is already slow.
>>> 
>>> I can't think of any major reason not to do as you describe.
>>> 
>>> However, would an easier approach be to find files in the zip, a bit like this <http://stackoverflow.com/questions/11123528/finding-a-file-in-zipentry-java>?  Speed would not be an issue because it would only be done once or twice after fetching the list, e.g. to view About or to actually download.  The mod.conf file name/path could be saved in SBMD if required.
>>> 
>>> Martin
>>> 
>>> 
>>> On 11 January 2016 at 01:39, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>> I’m trying to figure out how to reload a conf from a remote source (to go from a partial load to a full load).  The problem is that the AbstractSwordInstaller sits on top of mods.d.tar.gz, which it does not unpack. Instead, it iterates over all the entries in that binary file and handles each entry (i.e. a conf) in core. It doesn’t hit the disk. I’m wondering whether it would be alright to unpack the file in the same folder? That would allow a SwordBookMetaData to reload the file. It would also mean that SwordBookMetaData would only need one means of reading a conf as it’d be a file and not a byte array.
>>> 
>>> It isn’t a problem with desktop or server apps, but it might be for AndBible.
>>> 
>>> — DM
>>> 
>>> 
>>> 
>>>> On Jan 10, 2016, at 3:31 PM, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>>> 
>>>> The problem you encountered was caused by 2 bugs:
>>>> 1. When the module is not UTF-8, the remote repository’s conf is re-read, but the filter wasn’t passed.
>>>> 2. Unintentionally, IniSection required a filter, rather than treating a null filter as meaning everything passes.
>>>> 
>>>> I’ve checked in that fix. Still trying to make the memory less….
>>>> 
>>>> — DM
>>>> 
>>>>> On Jan 10, 2016, at 1:18 PM, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>>>> 
>>>>> The “Partial load of conf file.” commit was to load all of the things in a conf that the JSword engine needs to work with a module. I don’t know why the CrossWire repo is working for me but not for you. I’ll keep working on it today. The problem with the previous commit was fixed with the last commit. I wasn’t “adjusting” the module after loading to fill in things like BookDriver and BookCategory.
>>>>> 
>>>>> I’m wondering whether getting the list of Books from the installer creates a deep rather than a shallow copy of them.
>>>>> 
>>>>> Today I hope to make SwordBookMetaData even more lazy. It has a BookDriver and validates its storage when the repo is loaded. I plan to break one of my modules by renaming one of the files and see the impact. Chris and I have noticed that the FileState objects are not fully released. This actually is part of the design.
>>>>> 
>>>>> Anyway, I think it is going in the right direction. Reducing the memory 4x is a good thing. The data structures within the IniSection may be too heavy. I may relax the requirement that it maintains the SWORD conf’s order. The idea was to be able to modify the provided conf, retaining its order. However, now we never modify that conf.
>>>>> 
>>>>> configAll was a deep clone of configSword. configAll adds in the contents of configJSword and then configFrontend. These last two are created even if not needed. We could make them lazy as well.
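The memory difference between a deep clone and a shallow copy can be shown with plain maps. This is a generic illustration, not the IniSection implementation; `CopyDemo` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class CopyDemo {
    /** Shallow copy: a new map whose entries point at the same value objects. */
    static Map<String, String> shallowCopy(Map<String, String> source) {
        return new HashMap<>(source);
    }

    /** Deep copy: a new map holding its own duplicate of every value. */
    static Map<String, String> deepCopy(Map<String, String> source) {
        Map<String, String> copy = new HashMap<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            copy.put(e.getKey(), new String(e.getValue())); // duplicates the value
        }
        return copy;
    }
}
```

With a shallow copy, dropping one of the two maps frees little beyond the map's own bookkeeping. With a deep copy, the values exist twice, so dropping one map can free roughly half of the data.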
>>>>> 
>>>>> DM
>>>>> 
>>>>>> On Jan 10, 2016, at 11:07 AM, Martin Denham <mjdenham at gmail.com <mailto:mjdenham at gmail.com>> wrote:
>>>>>> 
>>>>>> Thanks for the quick response.  I have had a brief look at the new commits.
>>>>>> 
>>>>>> A lot of the attributes aren't being returned now so it is tricky to test and there are various errors but running the current tip 'Partial load of conf file. <https://github.com/crosswire/jsword/commit/80020f51c6a762d458ce8ae70007b78eadee1fb3>' the SBMD for eBible is now only a quarter of the original size at 10Mb which is fine but I still don't understand why it is so large for the minimal attribute set now being returned.
>>>>>> 
>>>>>> I get a lot of errors like:
>>>>>> SwordBookMetaData(492): Book not supported: malformed conf file for [BBE] no ModDrv found.
>>>>>> SwordBookMetaData(492): Malformed conf file: missing [BBE]Description=. Using BBE
>>>>>> 
>>>>>> and peculiarly the eBible repo seems to be the only repo I can use because all the others error.
>>>>>> 
>>>>>> I also tried the previous commit 'Cut the memory requirements of a SwordBookMetaData in half. <https://github.com/crosswire/jsword/commit/cc32ba8f1bb245932a747390d03874b2be70e9a1>' but it did not work because basic attributes like language were not being returned.
>>>>>> 
>>>>>> I still don't understand why removing configSword should reduce memory by half because it should just be removing references to data that is also referenced from configAll, so it would reduce memory slightly but not much.
>>>>>> 
>>>>>> Martin
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 10 January 2016 at 04:14, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>>>>> OK. That’s done. Also accidentally introduced a bug with the last commit. It is noticeably fast.
>>>>>> 
>>>>>> Next up, allow for *a* SwordBookMetaData to be reloaded fully. This is needed to bring in all the other elements which are information only, such as About, in order to display info to the end user. Since the user will only look at one module’s info at a time, it will load that one. You may need to change your code (hope not) to force that one to reload.
>>>>>> 
>>>>>> Give the code a try to see if it solves your out of memory error.
>>>>>> 
>>>>>> DM
>>>>>> 
>>>>>> 
>>>>>>> On Jan 9, 2016, at 9:06 PM, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>>>>>> 
>>>>>>> I’ll be adding a filter to IniSection. Something like:
>>>>>>> if (filter.test(key)) {
>>>>>>>     // use the key
>>>>>>> } else {
>>>>>>>     // do nothing
>>>>>>> }
>>>>>>> 
>>>>>>> SwordBookMetaData will be responsible for building the filter. At least for a first go around. A single object should do.
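A sketch of how such a key filter might be applied while parsing is shown below. It is an assumption-laden illustration: `java.util.function.Predicate` stands in for whatever filter type IniSection ends up using, `FilteredConfParser` is a hypothetical name, and a null filter is treated as "keep everything".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

final class FilteredConfParser {
    /**
     * Parses key=value lines, keeping only keys the filter accepts.
     * A null filter keeps every key.
     */
    static Map<String, String> parse(Iterable<String> lines, Predicate<String> filter) {
        Map<String, String> kept = new LinkedHashMap<>(); // preserves the conf's order
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // section headers, blank lines and anything without a key
            }
            String key = line.substring(0, eq).trim();
            if (filter == null || filter.test(key)) {
                kept.put(key, line.substring(eq + 1).trim());
            }
        }
        return kept;
    }
}
```

A filter built from a fixed set of key names, e.g. `Set.of("Description", "ModDrv")::contains`, would be the "single object" case: one filter instance shared by every conf that is loaded.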
>>>>>>> 
>>>>>>> DM
>>>>>>> 
>>>>>>>> On Jan 9, 2016, at 6:29 PM, DM Smith <dmsmith at crosswire.org <mailto:dmsmith at crosswire.org>> wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Yes, like you I have thought of streamlining conf loading for repo lists.  One idea I had was to enable specification of a filter to SwordBookMetaData to limit the conf values that are stored.
>>>>>>>> 
>>>>>>>> I was thinking of something similar. My ideas aren’t good enough to be put into practice, but some kind of flag indicating empty, partially or fully loaded. Empty would mean that it hasn’t gone to disk to get the conf. Partial means that it read everything, but threw away most as not interesting (since the conf does not have order you have to read and parse it all). Full would mean that nothing was pitched. SwordBookMetaData.getProperty would need to be changed to determine whether the key is in memory or might be on disk and do the right thing. Or we could keep getProperty as it is and if you want one of the fields that is not stored (e.g. About) you have to call reload().
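The empty/partial/full idea, with getProperty deciding when to go back to disk, could look roughly like this. It is a sketch only: `LazyConf`, `LoadState` and the `Supplier` that stands in for "read and parse the whole conf" are hypothetical, not SwordBookMetaData's actual design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

final class LazyConf {
    enum LoadState { EMPTY, PARTIAL, FULL }

    private final Supplier<Map<String, String>> reader; // reads and parses the whole conf
    private final Set<String> essentialKeys;            // keys kept by a partial load
    private final Map<String, String> values = new HashMap<>();
    private LoadState state = LoadState.EMPTY;

    LazyConf(Supplier<Map<String, String>> reader, Set<String> essentialKeys) {
        this.reader = reader;
        this.essentialKeys = essentialKeys;
    }

    LoadState state() {
        return state;
    }

    /** Returns the value, loading more of the conf only when it is needed. */
    String getProperty(String key) {
        if (state == LoadState.EMPTY) {
            load(false);                    // first access: partial load
        }
        if (!values.containsKey(key) && state == LoadState.PARTIAL) {
            load(true);                     // key may have been pitched: full load
        }
        return values.get(key);
    }

    private void load(boolean everything) {
        values.clear();
        for (Map.Entry<String, String> e : reader.get().entrySet()) {
            if (everything || essentialKeys.contains(e.getKey())) {
                values.put(e.getKey(), e.getValue());
            }
        }
        state = everything ? LoadState.FULL : LoadState.PARTIAL;
    }
}
```

The alternative mentioned above, keeping getProperty unchanged and requiring an explicit reload(), would simply expose `load(true)` as a public method instead of calling it implicitly.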
>>>>>>>> 
>>>>>>>> Maybe we could also cache that info into a separate file(s)? When mods.d.tar.gz is updated then the cache would be recomputed. In doing the computation, each conf would be read then pitched. Basically, the storage would be o.c.c.utils.Ini, if one file or IniSection, if many files.
>>>>>>>> 
>>>>>>>> What do you think?
>>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> jsword-devel mailing list
>>>>>>> jsword-devel at crosswire.org <mailto:jsword-devel at crosswire.org>
>>>>>>> http://www.crosswire.org/mailman/listinfo/jsword-devel <http://www.crosswire.org/mailman/listinfo/jsword-devel>