The Java community on Twitter

Clement Levallois
4 min read · Mar 9, 2021

Java is one of the key programming languages used in today’s digital world.

An informal poll created by Jason Warner, GitHub’s CTO

Java powers Android phones, many of your desktop applications, and much of the IT infrastructure which supports the digital services we all use daily.

Java is huge, so how can we get a sense of its community and ecosystem?

People and organizations with an interest in Java often have an account on Twitter. Using a methodology I developed with co-authors, it is possible to map the communities contributing to the Java ecosystem and their related topics. Here is the result; see the bottom of this post for additional comments on the methodology.

The global view

We get 2,913 accounts grouped into different communities. The picture above shows the main communities of interest, along with the terms and key users in them.

While this is broadly interesting, the picture hides many details. We can pick each of these colored communities and zoom in on it to reveal subcommunities and subtopics.

Here are the key communities (click on them to zoom in or open the high-resolution version):

The next step would of course be to choose one of these communities of interest and see who is in it, so that you can connect with them on Twitter or just do some networking. Unfortunately, the Twitter terms of service prevent me from publishing this information. I’ll do my best to answer your questions if you get in touch (Clement Levallois).

The methodology

The full details are available in this open access article, co-authored with Mohamed Benabdelkrim, Jean Savinien and Céline Robardet.
The basic steps are:

  • 🎯 select “seed accounts” that are typical of a given field. I selected @java, @Java_Champions, @Devoxx, @devnexus, @JavaAtMicrosoft, @JakartaEE, @EclipseJavaIDE, @intellijidea. Choosing different seeds would have led to different results.
  • 🛴 collect the public Twitter lists that these seed accounts belong to.
  • 🛴 collect the list of Twitter accounts that belong to these lists.
  • 🔁 rinse and repeat: collect the public lists that these accounts belong to, then collect the accounts in those lists (with some sampling at this step, obviously).
  • 👬 do pair-wise comparisons: a connection is made between two accounts if they appear together in many lists.
  • 🚿 the network we obtain is then cleaned in many ways.
  • ✍️ textual information is retrieved from the public bios of the accounts and is cleaned in many ways (handled per language, lemmatized, turned into n-grams, etc.).
  • 🕵️‍♂️ communities are detected with the Louvain algorithm implemented in Gephi.
  • 🗺️ maps of the networks are made with the Gephi toolkit, which is an amazing network library built on top of the Apache NetBeans platform.
  • ⚙️ the workflow of designing maps on slides is automated with the Google Slides API. Text annotations and arrows are generated and placed programmatically at this step.
  • ⏳ from the selection of seeds to the annotated maps, the process takes about half an hour.
  • ♨️ all this is coded in… Java :-)
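
The pair-wise comparison step listed above can be illustrated with a minimal sketch. This is not the code used for the maps: the class name `CoMembership`, the method `buildEdges`, the `minShared` threshold and the sample handles and list names are all hypothetical, chosen only to show the idea of linking two accounts when they share enough lists. The article does not say how many shared lists count as "many", so the threshold is left as a parameter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CoMembership {

    /**
     * Builds weighted edges between accounts that share at least minShared lists.
     * Keys are "accountA|accountB" with the two handles in alphabetical order;
     * values are the number of lists the two accounts have in common.
     * (Hypothetical helper: the real pipeline's threshold and cleaning are not shown.)
     */
    public static Map<String, Integer> buildEdges(
            Map<String, Set<String>> listsByAccount, int minShared) {
        // Sort the handles so that each pair is visited once, in a stable order.
        List<String> accounts = new ArrayList<>(listsByAccount.keySet());
        Collections.sort(accounts);

        Map<String, Integer> edges = new TreeMap<>();
        for (int i = 0; i < accounts.size(); i++) {
            for (int j = i + 1; j < accounts.size(); j++) {
                String a = accounts.get(i);
                String b = accounts.get(j);
                // Intersect the two sets of lists without mutating the input.
                Set<String> shared = new HashSet<>(listsByAccount.get(a));
                shared.retainAll(listsByAccount.get(b));
                if (shared.size() >= minShared) {
                    edges.put(a + "|" + b, shared.size());
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // Made-up accounts and list names, for illustration only.
        Map<String, Set<String>> listsByAccount = Map.of(
                "@alice", Set.of("java-people", "jvm-news", "conferences"),
                "@bob", Set.of("java-people", "jvm-news"),
                "@carol", Set.of("conferences"));
        System.out.println(buildEdges(listsByAccount, 2));
    }
}
```

With a threshold of two shared lists, only @alice and @bob end up connected in this toy data. Comparing every pair is quadratic in the number of accounts, which is workable for a few thousand accounts; a larger crawl would probably call for an inverted index from lists to members instead.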

Clement Levallois

PhD, social scientist & data visualization specialist @EMLyon. @Gephi support team. #OpenAccess promoter. #Java dev.