Cleaning up unused dependencies in Maven projects

Maven may well be the most widely used build tool in the Java space, and almost everyone has had at least some exposure to it. Personally, it’s been at the heart of most of the projects I’ve worked on for many years, but I have a bit of a love-hate relationship with it.

One of the main benefits of Maven is its ability to manage and download dependencies for your projects. It will also grab the dependencies of your dependencies (transitive dependencies). This means you can depend on a third-party library X and not have to worry about specifying all the dependencies of X in your pom files. However, as your project grows you may start to lose touch with which libraries are actually being deployed with your applications. The largest project I’m currently working on has 228 pom.xml files, and recently I checked our built wars to see just what libs are making it into production. To be honest, it was a bit of a shock: some war files contained multiple versions of the same third-party library, and some even contained testing jars and libs such as mockito!

This doesn’t mean we are bad developers or sloppy with our projects. It means that one small dependency can pull in others with it, and if the scoping isn’t set correctly you end up with all manner of unwanted things making their way into your production environment.

Dependency Management

One way to help better control what is being used is to specify dependencies and versions in the <dependencyManagement> section of a “bom” file which can be imported into your project pom files. This will force that specified version to be used throughout all projects which use that bom.

One thing to note here is that the managed version takes priority. If dependency Y is in your dependency management at version 1.5 and a dependency X in your project needs version 2.0.3 of Y, you will still end up with version 1.5 in your final build. This can cause compile-time or even runtime issues if, for example, X suddenly finds the wrong method signatures in Y.

The Problem

Over the last 8+ years my project has grown to over 200 pom files with 602 unique dependencies (both internal and external libraries). Chances are that some of these are no longer being used and are just taking up space in my final war files, but Maven doesn’t have a good way of working out whether a dependency is actually needed.

You can run mvn dependency:analyze to show unused declared dependencies and used undeclared dependencies.

[WARNING] Used undeclared dependencies found:
[WARNING]    net.sf.uadetector:uadetector-resources:jar:2014.10:compile
[WARNING]    net.sf.uadetector:uadetector-core:jar:0.9.22:compile
[WARNING]    org.apache.avro:avro:jar:1.7.7:compile
[WARNING]    org.springframework:spring-context:jar:4.0.9.RELEASE:compile
[WARNING] Unused declared dependencies found:
[WARNING]    ch.qos.logback.contrib:logback-json-classic:jar:0.1.2:compile
[WARNING]    ch.qos.logback.contrib:logback-jackson:jar:0.1.2:compile
[WARNING]    com.fasterxml.jackson.core:jackson-databind:jar:2.5.1:compile
[WARNING]    org.kubek2k:springockito:jar:1.0.10-SNAPSHOT:test

Adding -DoutputXML=true makes it output the dependencies in XML format so you can paste them directly into your pom file. The problem is that this is a static analysis of your code and doesn’t take into account things that are only needed at runtime.
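With a couple of hundred modules, reading these warnings by eye gets tedious. A minimal sketch of collecting them from the log, assuming the output has the shape shown above (the function name parse_analyze_output is made up for this illustration, and the coordinate pattern is a guess that covers group:artifact:type:version:scope with an optional classifier):

```ruby
# Splits `mvn dependency:analyze` output into its two warning lists.
# Assumption: the log looks like the sample above, i.e. a heading line
# followed by indented dependency coordinates.
def parse_analyze_output(log)
  sections = { used_undeclared: [], unused_declared: [] }
  current = nil
  log.each_line do |line|
    # Drop the "[WARNING]" / "[INFO]" prefix and surrounding whitespace.
    text = line.sub(/\A\[(?:WARNING|INFO)\]\s*/, "").strip
    case text
    when /\AUsed undeclared dependencies found:/
      current = :used_undeclared
    when /\AUnused declared dependencies found:/
      current = :unused_declared
    when /\A[\w.\-]+(?::[\w.\-]+){4,5}\z/
      # A coordinate line belongs to whichever heading came last.
      sections[current] << text if current
    else
      # Anything else ends the current list.
      current = nil
    end
  end
  sections
end

# Example wiring (not run here):
#   report = parse_analyze_output(`mvn dependency:analyze`)
#   puts report[:unused_declared]
```

The same caveat applies to the parsed lists as to the raw output: an "unused" entry may still be needed at runtime.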

Googling the problem of cleaning up pom files turned up a Ruby script and blog post here: https://samulisiivonen.blogspot.co.nz/2012/01/cleanin-up-maven-dependencies.html The basic idea is to systematically remove a single dependency from a pom file and run a command (probably mvn clean install -N, where the -N stops Maven from recursing into sub-projects). If the Maven command succeeds, the dependency isn’t needed and can be removed; otherwise it’s put back in the pom and the next one is removed.
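The loop itself is simple. A rough sketch of the idea, not the script from that blog post: the name prune_dependencies and the regular expression are assumptions made for this illustration, and the build check is passed in as a block so the logic can be tried without running Maven.

```ruby
# Naive matcher for one <dependency>...</dependency> block, including its
# leading indentation and trailing newline. Assumption: every <dependency>
# block in the text is one you want to test; this pattern would also match
# entries inside <dependencyManagement> or plugin configuration.
DEPENDENCY_BLOCK = %r{[ \t]*<dependency>.*?</dependency>[ \t]*\n?}m

# Tries removing each dependency block in turn. The caller's block receives
# the candidate pom text and returns true if the build still passes.
def prune_dependencies(pom_text)
  kept = pom_text
  offset = 0
  while (match = DEPENDENCY_BLOCK.match(kept, offset))
    candidate = kept[0, match.begin(0)] + kept[match.end(0)..-1]
    if yield(candidate)
      kept = candidate        # build passed: the removal sticks
      offset = match.begin(0) # the next block has shifted into this position
    else
      offset = match.end(0)   # build failed: keep the block and move on
    end
  end
  kept
end

# Example wiring (not run here): write each candidate out and let Maven decide.
#   pruned = prune_dependencies(File.read("pom.xml")) do |candidate|
#     File.write("pom.xml", candidate)
#     system("mvn", "-q", "clean", "install", "-N")
#   end
#   File.write("pom.xml", pruned)
```

Each candidate costs a full Maven run, so the total time grows with the number of dependencies multiplied by the build time of the module.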

I ran this script over my code and 19 hours later it had removed lots of dependencies, but it turns out that this was kind of the opposite of what I wanted. Remember that Maven deals with transitive dependencies for you, so just because the Maven command succeeds doesn’t mean you have removed an unused dependency: it may still be present transitively through another dependency. The result of this Ruby script is the minimal pom needed to compile and test your project. The other downside is that if you remove the dependencies in a different order you may get a different resulting pom file, depending on what dependencies your dependencies depend on 🙂

What I want is a comprehensive list of all dependencies for each module in my application, and this list should contain only dependencies that are needed. My plan was to generate a full list of dependencies (including all transitive ones) and then use the enforcer plugin to ban transitive dependencies. Once that is done I could use the original script from the blog above to remove only the ones that genuinely aren’t being used.

The enforcer plugin is configured to enforce that all dependencies are explicitly listed in the poms.

Using the above Ruby script as inspiration, I looked into generating the effective pom with mvn help:effective-pom -Doutput=effective-pom.xml and then testing each dependency as that script had done. This basically flattens the pom and its parents into a single pom file, but it still doesn’t take transitive dependencies into account. The resulting file isn’t a full list of dependencies, so this didn’t work.

The final script I came up with uses mvn dependency:list to generate a full list of all dependencies and then modifies the pom file to add them all in.

Ruby isn’t a language I use often so the code probably isn’t the best but I must admit I do have a bit of a soft spot for it.
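As a rough illustration of the first half of that approach (a sketch, not the original script), the snippet below turns dependency:list output into <dependency> elements. It assumes each resolved dependency appears on an [INFO] line as group:artifact:type:version:scope, with an optional classifier before the version; the names Dependency, parse_dependency_list and to_pom_xml are made up for this example.

```ruby
Dependency = Struct.new(:group_id, :artifact_id, :type, :classifier, :version, :scope)

# Extracts dependency coordinates from `mvn dependency:list` output.
# Assumption: lines look like "[INFO]    org.apache.avro:avro:jar:1.7.7:compile".
def parse_dependency_list(output)
  output.each_line.map do |line|
    coords = line[/\A\[INFO\]\s+([\w.\-]+(?::[\w.\-]+){4,5})/, 1]
    next unless coords
    parts = coords.split(":")
    # Six parts means a classifier sits between the type and the version.
    classifier = parts.length == 6 ? parts[3] : nil
    Dependency.new(parts[0], parts[1], parts[2], classifier, parts[-2], parts[-1])
  end.compact
end

# Renders one dependency as a pom fragment, leaving out the default
# type (jar) and the default scope (compile).
def to_pom_xml(dep, indent = "    ")
  lines = ["<dependency>",
           "  <groupId>#{dep.group_id}</groupId>",
           "  <artifactId>#{dep.artifact_id}</artifactId>",
           "  <version>#{dep.version}</version>"]
  lines << "  <type>#{dep.type}</type>" unless dep.type == "jar"
  lines << "  <classifier>#{dep.classifier}</classifier>" if dep.classifier
  lines << "  <scope>#{dep.scope}</scope>" unless dep.scope == "compile"
  lines << "</dependency>"
  lines.map { |l| indent + l }.join("\n")
end

# Example wiring (not run here):
#   deps = parse_dependency_list(`mvn dependency:list -N`)
#   puts deps.map { |dep| to_pom_xml(dep) }.join("\n")
```

If versions are already pinned in dependency management, the <version> line could be dropped from the generated fragment; the sketch keeps it so the output is valid on its own.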

So after running my script and then the remove-extra-dependencies.rb script, I ended up with pom files which were in some cases smaller and in some cases larger (as all transitive dependencies are now explicit), but what really matters is the final wars.

I found that my apps didn’t run the first time because some dependencies used at runtime weren’t captured by my process above, so there was a little manual manipulation of the pom files after the scripts had run. Maybe my unit tests didn’t exercise all the code in the way it is run in a production-like environment. I also found that I somehow ended up with a few libs that had multiple versions again, and these needed to be managed in dependency management.
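Spotting those duplicates by eye in a long jar listing is error-prone. A small sketch of a check that could be run against a built war, assuming jar files are named artifact-version.jar with the version starting at the first hyphen followed by a digit (a heuristic that will misread some names); duplicate_jar_versions and app.war are hypothetical names.

```ruby
# Groups jar file names by artifact and reports artifacts that appear
# with more than one version.
def duplicate_jar_versions(jar_paths)
  versions_by_name = Hash.new { |hash, key| hash[key] = [] }
  jar_paths.each do |path|
    # Heuristic split: the version starts at the first "-" followed by a digit.
    match = File.basename(path).match(/\A(?<name>.+?)-(?<version>\d.*)\.jar\z/)
    next unless match
    versions_by_name[match[:name]] << match[:version]
  end
  versions_by_name
    .select { |_, versions| versions.uniq.length > 1 }
    .transform_values { |versions| versions.uniq.sort }
end

# Example wiring (not run here): list the war contents with the JDK's jar tool.
#   jars = `jar tf app.war`.lines.map(&:chomp).grep(%r{\AWEB-INF/lib/.+\.jar\z})
#   duplicate_jar_versions(jars).each { |name, versions| puts "#{name}: #{versions.join(', ')}" }
```

Two artifacts from different groups that share an artifact name would be reported together, so the output is a prompt for a closer look rather than a verdict.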

Conclusion

Overall it was a fun exercise to look into how all the jars end up in my final applications. It did take a lot of time to iron out the process of identifying unused dependencies, and the results of the automated scripts still needed a little manual massaging to make the apps run again afterwards. The resulting wars were smaller, sometimes having 6 to 12 fewer jars in them than I started with. Whether that is a worthwhile gain is debatable, I guess. It depends on your application’s use case.

I think the moral here is to take better care of your dependencies from day one and not let them get as out of control as ours had.