java.util.zip.ZipFile performance problems

I've been extremely pleased with the J# technology; however, I am experiencing serious performance problems using java.util.zip.ZipFile to open large zip files. My test is this simple:
import java.util.zip.ZipFile;

public class ZipOpenTest
{
    public static void main(String[] args)
        throws Exception
    {
        long t1 = System.currentTimeMillis();
        ZipFile zip = new ZipFile(args[0]);   // the call being timed
        long t2 = System.currentTimeMillis();
        System.out.println("Time: " + (t2-t1) + "ms");
        zip.close();
    }
}
Opening a large zip file (such as HotSpot's rt.jar) takes upwards of 10 seconds on a 3GHz dual-core machine. Running the same code on other Java VMs takes less than 10ms.
I've tried this in both .NET 1.1 and .NET 2.0 Beta 2 (on Windows XP). Is this a known problem? Is there a workaround or a fix?
Thanks much!
Brian
[776 byte] By [insanepickle] at [2008-2-22]
# 1
Do you need random access, or can you read sequentially with ZipInputStream?
EricMolitor at 2007-9-9 > top of Msdn Tech,Visual J#,Visual J# General...
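For readers unfamiliar with the sequential alternative mentioned above, here is a minimal sketch of scanning an archive with ZipInputStream until a named entry turns up. The class and method names (SequentialZipScan, findSequentially) are made up for illustration, and the code is written in an older Java style on the assumption that it should stay close to what J# accepts. It has not been timed under J#.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class SequentialZipScan {
    /**
     * Walks the archive in storage order and returns the bytes of the
     * first entry called `name`, or null if no such entry exists.
     */
    public static byte[] findSequentially(InputStream raw, String name)
            throws IOException {
        ZipInputStream in = new ZipInputStream(raw);
        try {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    // read() returns -1 at the end of the current entry
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                    return out.toByteArray();
                }
            }
            return null;
        } finally {
            in.close();
        }
    }

    // Usage: java SequentialZipScan <archive> <entry-name>
    public static void main(String[] args) throws IOException {
        byte[] data = findSequentially(new FileInputStream(args[0]), args[1]);
        System.out.println(data == null ? "not found" : data.length + " bytes");
    }
}
```

The trade-off is that every lookup may have to decompress everything stored before the target entry, so this only suits a single pass over the archive.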
# 2
We need random access and the ability to quickly look up a ZipEntry by name.
insanepickle at 2007-9-9
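To make the requirement concrete, the sketch below shows the kind of by-name lookup being described, using only java.util.zip.ZipFile.getEntry and getInputStream. The class name ZipLookup and the helper readEntry are hypothetical, and the timing printout simply mirrors the test in the original post. It is not a claim about how any particular VM performs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ZipLookup {
    /** Returns the bytes of the named entry, or null if it is absent. */
    public static byte[] readEntry(ZipFile zip, String name) throws IOException {
        ZipEntry entry = zip.getEntry(name);   // direct lookup by name
        if (entry == null) {
            return null;
        }
        InputStream in = zip.getInputStream(entry);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            in.close();
        }
    }

    // Usage: java ZipLookup <archive> <entry-name>
    public static void main(String[] args) throws IOException {
        long t1 = System.currentTimeMillis();
        ZipFile zip = new ZipFile(args[0]);
        long t2 = System.currentTimeMillis();
        byte[] data = readEntry(zip, args[1]);
        long t3 = System.currentTimeMillis();
        System.out.println("Open: " + (t2 - t1) + "ms, lookup+read: " + (t3 - t2) + "ms");
        System.out.println(data == null ? "not found" : data.length + " bytes");
        zip.close();
    }
}
```

Separating the open time from the lookup time shows whether the cost sits in the constructor or in the per-entry access.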
# 3
Well, I've not tried this with J#, but I used it to work around a weird JDK 1.1 issue with Kaffe. Grab the Ant source from Apache (http://ant.apache.org) and try their implementation in org.apache.tools.zip.ZipFile.

It's a clean implementation that doesn't extend java.util.zip.ZipFile, and it's quite fast. Let me know if you try it (and if it's faster). The Apache license shouldn't taint your code, but read it if you choose to use it.

Hopefully Microsoft will address this, but at least it's another option.

Cheers,
Eric

EricMolitor at 2007-9-9
# 4
Thanks Eric,
I was unaware that Ant had re-implemented the java.util.zip APIs. My backup plan was going to be SharpZipLib, but using the Ant Java code is a bit cleaner. I had to tweak the code just a bit to get it to compile under J#. The results were interesting. Times to open and read one small entry in a 21MB zip on a high-end PC:
HotSpot java.util.zip: 16ms
J# java.util.zip: 4700ms
HotSpot ant: 5140ms
J# ant: 470ms
So it does provide an order of magnitude performance improvement in J#, but isn't something you want to use in HotSpot. I'll probably rig it to use one or the other based on the environment. I'll let you know if I dig into it any more.
Even better, maybe someone from Microsoft will comment.
Thanks,
Brian
insanepickle at 2007-9-9
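One way the "one or the other based on the environment" idea might be rigged is sketched below. It assumes the runtime can be told apart by the java.vendor system property and that the J# runtime reports a vendor string containing "Microsoft". Both are assumptions that should be checked on the actual runtime. The class name ZipBackendChooser and the method chooseBackend are made up for illustration, and the sketch only returns the name of the implementation to use rather than instantiating it.

```java
import java.util.Locale;

public class ZipBackendChooser {
    public static final String JDK_IMPL = "java.util.zip.ZipFile";
    public static final String ANT_IMPL = "org.apache.tools.zip.ZipFile";

    /**
     * Picks a ZipFile implementation from the vendor string.
     * ASSUMPTION: a vendor containing "microsoft" means the J# runtime,
     * where the Ant implementation measured faster in the post above.
     */
    public static String chooseBackend(String vendor) {
        if (vendor != null
                && vendor.toLowerCase(Locale.ENGLISH).indexOf("microsoft") >= 0) {
            return ANT_IMPL;
        }
        return JDK_IMPL;
    }

    public static void main(String[] args) {
        String vendor = System.getProperty("java.vendor");
        System.out.println("java.vendor = " + vendor);
        System.out.println("Would use:    " + chooseBackend(vendor));
    }
}
```

Because the two ZipFile classes share no common supertype, real code would likely hide both behind a small wrapper of its own and make this choice once at startup.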
# 5
Hi Brian,

I could repro the issue with a 24 MB zip file.

This is important feedback, and we are aware of the performance issue here.
We are doing perf investigations at this time.

1) What is the overall scenario in which you hit this issue?

2) Is this a blocking issue for you?

3) For the time being, have you considered using the equivalent ZipFile API in the .NET Fx?

Thanks,
Varun



VarunGupta at 2007-9-9
# 6
Hi Varun,
Thanks for your reply.
Our application is a compiler which needs to open jar files to do type checking against the classfiles. That is why we are particularly concerned with big zip files, since rt.jar tends to be really big. So I guess I would describe that as a blocking issue, since the compiler can't look up any zip entries until the ZipFile constructor completes.
We are using J# so that our code can run under both HotSpot and .NET, so we would prefer to have java.util.zip work seamlessly. That said, if we have to use a different zip library when running under .NET, it isn't the end of the world.
I am unaware of which ZipFile API you are referring to. Did .NET 2.0 add a Zip library? If so, could you please point me towards the documentation?
Thanks much,
Brian
insanepickle at 2007-9-9