To obtain real performance gains from multi-core machines, programmers need to build parallel applications. However, building this kind of application is a demanding and error-prone task. Many programming languages, e.g., Go, Scala, Java, Erlang, C#, and Lua, provide their own set of constructs for concurrent and parallel programming.

Given the differences among the many existing approaches to concurrent programming, we would like to know how programmers use them: how frequently the constructs appear, how systems evolve over time, and whether programs become more concurrent across their versions. More generally, we would like to know which programming constructs developers actually use to build concurrent systems, and in particular whether programmers are aware of the transition from single-core to multi-core machines.

On the one hand, knowing how commonly programmers use these constructs may help researchers to design new mechanisms or improve existing ones, based on development practice. In addition, it can point out the real needs of developers, not only in terms of new or improved mechanisms, but in terms of refactoring and reengineering tools and techniques that can help them to incorporate these mechanisms into existing systems.

On the other hand, developer awareness of these usage patterns might lead to more efficient use of existing abstractions. Finally, for both researchers and developers, it is important to understand trends in software engineering, and only a large-scale study can gather that kind of information.

In this work we present an empirical study targeting a large-scale Java open source repository. Our main goal is to answer these research questions:

  • RQ1 - Have developers embraced concurrent programming?
  • RQ2 - Have developers moved to library-based concurrency?
  • RQ3 - How quickly do developers start using the java.util.concurrent (j.u.c.) library?
  • RQ4 - How do developers protect accesses to shared variables?
  • RQ5 - Are developers using thread-safe data structures?
  • RQ6 - How often do developers employ condition-based synchronization?
  • RQ7 - Do developers worry about errors that might cause threads to end abruptly?

Groundhog Infrastructure

Groundhog Architecture

The crawlers are an extension of Crawler4j, an open source web crawler that is multithreaded and written in Java. We also implemented additional scripts to order project versions and to check whether the target project is ready to be analyzed, fixing its structure when necessary. To collect concurrency metrics, we used the JavaCompiler class to parse the source code and build parse trees. The trees are traversed, and the metrics are extracted and stored in text files. Metrics collection consisted of counting lines, imports, class instantiations, method invocations, class extensions, interface implementations, and uses of some Java keywords. Some of the collected metrics are the numbers of: extends Thread, implements Runnable, import j.u.c., synchronized methods, synchronized blocks, Hashtable, HashMap, ConcurrentHashMap, AtomicInteger, and lines of code.
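The parse-and-count step described above can be illustrated with a minimal sketch. It is not the Groundhog implementation: it assumes the standard javax.tools.JavaCompiler entry point together with the JDK Compiler Tree API (JavacTask, TreeScanner), and the names ConcurrencyMetrics, StringSource, Counter, TRACKED, and the metric labels are hypothetical. It matches types by simple name only and returns the counts in a map instead of writing them to text files.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ImportTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.NewClassTree;
import com.sun.source.tree.ParameterizedTypeTree;
import com.sun.source.tree.SynchronizedTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import javax.lang.model.element.Modifier;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class ConcurrencyMetrics {
    // Hypothetical subset of the instantiated types the study counts.
    private static final Set<String> TRACKED =
            Set.of("Hashtable", "HashMap", "ConcurrentHashMap", "AtomicInteger");

    /** In-memory source file, so the sketch needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Traverses a parse tree and counts a few concurrency-related constructs. */
    static final class Counter extends TreeScanner<Void, Void> {
        final Map<String, Integer> counts = new TreeMap<>();
        private void bump(String metric) { counts.merge(metric, 1, Integer::sum); }

        @Override public Void visitImport(ImportTree node, Void p) {
            if (node.getQualifiedIdentifier().toString().startsWith("java.util.concurrent")) {
                bump("import j.u.c");
            }
            return super.visitImport(node, p);
        }
        @Override public Void visitClass(ClassTree node, Void p) {
            Tree parent = node.getExtendsClause(); // null when there is no extends clause
            if (parent != null && parent.toString().equals("Thread")) bump("extends Thread");
            for (Tree iface : node.getImplementsClause()) {
                if (iface.toString().equals("Runnable")) bump("implements Runnable");
            }
            return super.visitClass(node, p);
        }
        @Override public Void visitMethod(MethodTree node, Void p) {
            if (node.getModifiers().getFlags().contains(Modifier.SYNCHRONIZED)) {
                bump("synchronized methods");
            }
            return super.visitMethod(node, p);
        }
        @Override public Void visitSynchronized(SynchronizedTree node, Void p) {
            bump("synchronized blocks");
            return super.visitSynchronized(node, p);
        }
        @Override public Void visitNewClass(NewClassTree node, Void p) {
            Tree type = node.getIdentifier();
            if (type instanceof ParameterizedTypeTree) { // strip generic arguments
                type = ((ParameterizedTypeTree) type).getType();
            }
            if (TRACKED.contains(type.toString())) bump("new " + type);
            return super.visitNewClass(node, p);
        }
    }

    /** Parses one source file (no type checking) and returns its metric counts. */
    static Map<String, Integer> count(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // requires a JDK
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, source)));
        Counter counter = new Counter();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                counter.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.counts;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "import java.util.concurrent.atomic.AtomicInteger;",
                "class Worker extends Thread {",
                "  private final AtomicInteger hits = new AtomicInteger();",
                "  synchronized void reset() { hits.set(0); }",
                "  void touch() { synchronized (this) { hits.incrementAndGet(); } }",
                "}");
        count("Worker", sample).forEach((metric, n) -> System.out.println(metric + " = " + n));
    }
}
```

Because the sketch only parses and does not resolve symbols, a user-defined class that happens to be named Thread or HashMap would be counted as well. A study at repository scale would need to decide how to handle such name clashes.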