Weka 3.5.3 and JVM with 1.4 – 1.6 GB memory limit workaround

Feeds:: Posts; Comments

Weka 3.5.3 and JVM with 1.4 – 1.6 GB memory limit workaround

February 26, 2008 by invictium

I’m running Weka 3.5.3 on a server with 4 GB memory with sampled-data set as large as 24.000 rows with 19 fields. When i run a simple J4.8 classification, the JVM showed-up an out-of-memory exception.

As a workaround, i fiddle RunWeka.ini and and create my own windows-based short-cut as follows (i needed the LibSVM, but if you dont require libSVM then you can simply remove libsvm.jar from the line):

C:WINDOWSsystem32javaw.exe -Xmx1400m -classpath “%CLASSPATH%;weka.jar;libsvm.jar” weka.gui.GUIChooser

but before that i make sure that i have set the correct CLASSPATH by pasting this command line below into windows prompt :

set CLASSPATH=C:Program FilesWeka-3-5weka.jar;%CLASSPATH%

Now I have succesfully increased the Java Virtual memory but i wasn’t able to run J4.8 because another out-of-memory exception was thrown again. I have tried to increase the Xmx option to 1600M but JVM said that it cant allocate the required memory.

After googling around i have found out 2 interesting things/facts that i have learn the hard way:

1. Windows 32-bit have a memory limit of 4GB because of max memory addressing issue on a 32-bit system, that’s minus the shared memory you use for VGA display. So basically you will only be able to access 3.8GB at max. If you have larger memory than 4 GB (let say 10GB), you would need to add /PAE to allow your windows to utilize this memory (convert it into 36-bit).

2. Windows 32-bit default setting allocates 2GB user-space and 2GB kernel-space. User-space is the amount of memory that can be allocated to 1 process (javaw.exe). However windows setting allows you to change the user-space allocation to 3GB. This can be done by changing windows boot option to /PAE /3GB, then reboot the box. This option is accessible by left-clicking on My Computer -> Properties -> Advanced -> Startup and Recovery. You will see the box like below :

(click edit and notepad should appear, add the parameters, save, verify, reboot, and you should go fine..)

3. Although i have activated point 2 settings above, the JRE doesn’t seem to be able to allocate VM more than 1.6 GB. This is the major disadvantage of building data mining application under JAVA and also a typical JRE problem on 32-bit machine but this wasn’t the case for Sun-based O/S.

To break this memory barrier, i use Bea JRockIt 5 (30MB) :

http://download2.bea.com/pub/jrockit/50/jrockit-R27.5.0-jre1.5.0_14-windows-ia32.exe

Please note that the trial version would run only for 1 hours.

After i installed it on my machine, i change the shortcut on point 1 (-Xmx) to 2800M (i think this is the max number for 32-Bit), double click on it, then voila.. It works!

I have also test this with Weka 3.5.7 and it runs smoothly.

So basically if you want to run Weka for more that 1.6 GB memory you need to:

1. Activate the -Xmx settings

2. Activate /3GB on your windows

3. Install JRock IT

Now i was able to run the J4.8 classification without OOM warning. However for larger data-set i still dont have any clue on how to run it. Perhaps in that case, running Weka on SPARC box or running parallel-Weka or compiling weka source code to windows DLL ( using JET Excelsior ) might save you.

The result of memory utilization (> 2GB limit) can be viewed by clicking on the picture below :

Posted in Science Spirit, Techie Spirit | Tagged data mining, memory limit, weka | 6 Comments

6 Responses

on August 5, 2009 at 8:45 pm mwojnars

You might be interested in Debellor (www.debellor.org). It is an open source framework for scalable data mining – processing large volumes of data without OutOfMemory exceptions – this is possible thanks to stream-oriented architecture, where algorithms may be connected in a pipeline and process data samples on-the-fly, without buffering in memory. A part of Weka algorithms are integrated with Debellor, so you can use them in connection with your own implementations.
on August 18, 2009 at 11:11 pm invictium

Thank you mwojnars,
I have read the presentation and the paper and so far it looks promising… 🙂
I’ll give a closer look at your work (Debellor) and hopefully I can test & review it using my 1M-rows dataset,…
PS: Debellor w/ source-code would be better because more people can test and improve your work…. 🙂

Good work!
on September 9, 2009 at 3:41 pm mwojnars

Hi!
You’ll find Debellor’s source code in the same JAR as byte code – debellor.jar – if you open it as a compressed file you’ll find .java and .class files together. Another way is to view or download sources directly from SVN (password not needed):

https://debellor.svn.sourceforge.net/svnroot/debellor
on December 17, 2010 at 1:10 am carpathiananonymous

Hello

Thanks for your help. I managed to change the -Xmx value only by installing JRockIT (step 3). I was running Weka on Windows7 32-bit, and I couldn’t make the value of -Xmx larger then 1066m. After I installed JRockIT, I modified this value (in RunWeka.ini) to larger extents and the machine started just fine. Hope this will help 🙂
on December 17, 2010 at 1:28 am invictium

how big is you physical memory… ?
I had 4GB back then (waste of MB on 32 bit system).. and the memory level stayed consistently at 2800MB in my case..
on January 4, 2011 at 8:37 pm carpathiananonymous

I have 3GB of memory.
It seems that after I installed JRockIT and it had overridden the public Java Runtime Environment (JRE) I am able to start Weka with the desired amount of memory.
The problem is that now the classifier isn’t working properly, it doesn’t do anything. I let it run several hours and I had no answer.

Comments RSS

	carpathiananonymous on Weka 3.5.3 and JVM with 1.4…
	invictium on Weka 3.5.3 and JVM with 1.4…
	carpathiananonymous on Weka 3.5.3 and JVM with 1.4…
	mwojnars on Weka 3.5.3 and JVM with 1.4…
	invictium on Weka 3.5.3 and JVM with 1.4…

Invictium

…the essence of being unconquerable…