Improved similarity measures for software clustering

To combine the strengths of various algorithms, Consensus Based Techniques (CBTs) can be used, where more than one actor [...] common goal. [...] employs cooperation among more than one similarity measure during the hierarchical clustering process. Cooperative Clustering is capable of showing significant improvement over individual clustering algorithms for software modularization.

Keywords: Software clustering process, Feature vector cases, Cooperative software clustering

I. [...]

A well-documented architecture can improve the quality and maintainability of a software system. However, many existing systems often do not have their architecture documented. A number of clustering algorithms and measures have been proposed and applied for software modularization. Traditionally, CBTs have been studied as the integration of more [...]

This technique can be implemented for clustering algorithms which are iterative in nature [6], [7]. CCT allows parallel execution of algorithms, which support each other by exchanging information at each iteration. The intermediate-level cooperation suggested in CCT need not be restricted to cooperation between algorithms only; a broader view may be taken. For example, cooperation may be in the form of more than one similarity measure producing results at each iteration, which are combined during the clustering process. Cooperative techniques have been explored in many disciplines.

[...] to change it and also to evaluate the side effects of a change. Coupling is the degree of dependency between the modules and cohesion is the inter-[...] Low coupling [...] When combined with high cohesion, it provides high reliability and maintainability. An important application of cluster analysis is to modularize a software system by grouping together software entities that are similar or related to each other and thereby achieve minimum coupling and [...]

Software clustering is the process of decomposing a software system into meaningful subsystems. It is the process of finding similar groups of entities in data. Entities within a cluster [...] Clustering thus provides a high-level view of the system by grouping together related or similar software entities. Software clustering plays an important role [...] software projects. Software clustering holds out the promise of helping in this task. During the [...] containing 4 entities E1 to E4 and 6 binary features f1 to f6.

      f1  f2  f3  f4  f5  f6
E1    [...]
E2    1   1   1   0   0   1
E3    1   0   0   1   0   1
E4    1   0   0   1   1   0

B. Selection of Similarity Measures

In the second step, a similarity measure is applied to compute similarity between every pair of entities, resulting in a similarity matrix. Selection of a similarity measure should be done carefully, because selecting an appropriate similarity measure may influence clustering results more than the [...]
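The step described above, applying a similarity measure to every pair of entities to obtain a similarity matrix, can be sketched as follows. This is only an illustration: it uses the three feature rows that survive in the text (E2, E3, E4 over f1 to f6), picks the Jaccard measure a / (a + b + c) as the example measure, and the helper names `jaccard` and `similarity_matrix` are hypothetical, not taken from the paper.

```python
from itertools import combinations

# Binary feature vectors over f1..f6, copied from the rows E2..E4 in the text.
# The row for E1 is not recoverable, so it is left out of this sketch.
features = {
    "E2": [1, 1, 1, 0, 0, 1],
    "E3": [1, 0, 0, 1, 0, 1],
    "E4": [1, 0, 0, 1, 1, 0],
}


def jaccard(x, y):
    """Jaccard similarity a / (a + b + c) for two equal-length binary vectors."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)  # feature in both
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)  # only in x
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)  # only in y
    denom = a + b + c
    return a / denom if denom else 0.0


def similarity_matrix(vectors, measure):
    """Return a symmetric nested dict of pairwise similarities (hypothetical helper)."""
    names = sorted(vectors)
    # Assumption: an entity is treated as fully similar to itself.
    matrix = {name: {name: 1.0} for name in names}
    for u, v in combinations(names, 2):
        score = measure(vectors[u], vectors[v])
        matrix[u][v] = score
        matrix[v][u] = score
    return matrix


sim = similarity_matrix(features, jaccard)
for u, v in combinations(sorted(features), 2):
    print(f"{u}-{v}: {sim[u][v]:.3f}")
# E2-E3: 0.400
# E2-E4: 0.167
# E3-E4: 0.500
```

Under these assumptions E3 and E4 form the most similar pair, so an agglomerative hierarchical algorithm driven by this matrix would be expected to merge them first. Swapping in a different `measure` function changes the matrix and can change that first merge, which is why the choice of measure matters.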

Naseem, R.

Abstract: Software clustering is a useful technique to recover the architecture of a software system. The results of clustering depend upon the choice of entities, features, similarity measures and clustering algorithms. Different similarity measures have been used for determining similarity between entities during the clustering process.

In the software architecture recovery domain, the Jaccard and the Unbiased Ellenberg measures have shown better results than other measures for binary and non-binary features, respectively. In this paper we analyze the Russell and Rao measure for binary features to show the conditions under which its performance is expected to be better than that of Jaccard.

In this paper, we highlight cases where the Jaccard measure may fail to capture similarity between entities appropriately.

We propose a new similarity measure which overcomes these deficiencies. Our experimental results indicate that the new similarity measure performs better for software systems exhibiting the defined characteristics.
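To make the comparison between the two binary measures named in the abstract concrete, the sketch below computes both from the usual match counts: a (feature present in both entities), b and c (present in only one), and d (absent in both). Jaccard is a / (a + b + c) and Russell and Rao is a / (a + b + c + d). The feature vectors are hypothetical, chosen only to show one way the measures can disagree; they are not the cases or conditions analyzed in the paper, and the new measure the abstract proposes is not reproduced here.

```python
def match_counts(x, y):
    """Count a (1,1), b (1,0), c (0,1) and d (0,0) positions of two binary vectors."""
    a = b = c = d = 0
    for p, q in zip(x, y):
        if p and q:
            a += 1
        elif p:
            b += 1
        elif q:
            c += 1
        else:
            d += 1
    return a, b, c, d


def jaccard(x, y):
    """Jaccard: a / (a + b + c); joint absences (d) are ignored."""
    a, b, c, _ = match_counts(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0


def russell_rao(x, y):
    """Russell and Rao: a / (a + b + c + d); joint absences lower the score."""
    a, b, c, d = match_counts(x, y)
    total = a + b + c + d
    return a / total if total else 0.0


# Hypothetical pairs: both pairs are identical vectors, but the second pair
# shares three features while the first shares only one.
one_shared = ([1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0])
three_shared = ([1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0])

for label, (x, y) in (("one shared", one_shared), ("three shared", three_shared)):
    print(f"{label}: Jaccard={jaccard(x, y):.3f}  Russell-Rao={russell_rao(x, y):.3f}")
# one shared: Jaccard=1.000  Russell-Rao=0.167
# three shared: Jaccard=1.000  Russell-Rao=0.500
```

In this toy case Jaccard rates both pairs as equally similar, while Russell and Rao ranks the pair with more shared features higher because its denominator counts every feature.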


