RQ3: Do code clones in deep learning projects cause co-changes, and to what extent?
Glossary
- POC: percentage of clones in a project.
- PCC: percentage of co-changed clones in a project.
- Size: size of a project.
- LC: lines of code (LOC) of a clone.
- PCTC: percentage of co-changed clones among all clones.
- PLOC: number of lines of Python code.
- NormPCTC: percentage of co-changed clone lines among all clone lines.
Association between POC and PCC
Overview
POC range | 0 - 5% | 5 - 10% | 10 - 15% | 15 - 20% | 20 - 25% | 25 - 30% | > 30% |
---|---|---|---|---|---|---|---|
AvgPCC | 7.3% | 9.2% | 16.7% | 11.1% | 9.0% | 14.8% | 49.0% |
Parts
- Part 1: 0 - 5%
Subject | POC | PCC |
---|---|---|
deepvariant | 4.5% | 31.3% |
pytorch-lightning | 4.0% | 0 |
GPflow | 3.4% | 0 |
OpenNMT-py | 3.4% | 0 |
torchio | 3.2% | 0 |
tensorpack | 1.1% | 12.5% |
Avg | 3.2% | 7.3% |
- Part 2: 5 - 10%
Subject | POC | PCC |
---|---|---|
DeepLabCut | 9.5% | 16.7% |
DeepPavlov | 8.7% | 2.0% |
PySyft | 8.4% | 0 |
raster-vision | 8.2% | 0 |
MONAI | 7.0% | 32.3% |
spaCy | 6.9% | 0 |
ray | 6.4% | 5.9% |
Hub | 5.2% | 16.7% |
Avg | 7.6% | 9.2% |
- Part 3: 10 - 15%
Subject | POC | PCC |
---|---|---|
ludwig | 15.0% | 38.7% |
catalyst | 13.5% | 39.5% |
tensor2tensor | 13.4% | 0.7% |
allennlp | 12.2% | 7.5% |
luminoth | 11.9% | 4.5% |
coach | 10.8% | 17.9% |
clearml | 10.5% | 17.3% |
TTS | 10.3% | 7.0% |
Avg | 12.2% | 16.7% |
- Part 4: 15 - 20%
Subject | POC | PCC |
---|---|---|
tfx | 20.0% | 3.7% |
chainer | 19.2% | 2.4% |
autokeras | 16.7% | 18.8% |
torch-points3d | 16.0% | 10.2% |
stellargraph | 15.1% | 20.6% |
Avg | 17.4% | 11.1% |
- Part 5: 20 - 25%
Subject | POC | PCC |
---|---|---|
nni | 23.8% | 3.1% |
texar | 23.1% | 17.5% |
horovod | 22.5% | 2.8% |
sonnet | 22.4% | 0 |
transformers | 21.6% | 6.6% |
keras | 21.2% | 16.0% |
DIG | 20.6% | 23.7% |
addons | 20.1% | 2.6% |
Avg | 21.9% | 9.0% |
- Part 6: 25 - 30%
Subject | POC | PCC |
---|---|---|
TensorLayer | 29.7% | 14.2% |
imgaug | 28.8% | 15.9% |
deepchem | 28.2% | 4.5% |
pyod | 25.7% | 24.7% |
Avg | 28.1% | 14.8% |
- Part 7: > 30%
Subject | POC | PCC |
---|---|---|
tianshou | 51.1% | 77.9% |
ignite | 38.9% | 7.0% |
tflearn | 35.4% | 62.1% |
Avg | 41.8% | 49.0% |
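The per-group averages above can be recomputed from the per-project rows. The following is a minimal sketch, not the script used for the study: the names `ROWS`, `EDGES`, `poc_group`, and `avg_pcc_by_group` are hypothetical, only the rows of parts 6 and 7 are copied in, and the groups are assumed to be right-inclusive (an assumption consistent with ludwig at 15.0% sitting in the 10 - 15% group and tfx at 20.0% in the 15 - 20% group).

```python
from collections import defaultdict

# (project, POC %, PCC %) rows copied from parts 6 and 7 above.
ROWS = [
    ("TensorLayer", 29.7, 14.2),
    ("imgaug", 28.8, 15.9),
    ("deepchem", 28.2, 4.5),
    ("pyod", 25.7, 24.7),
    ("tianshou", 51.1, 77.9),
    ("ignite", 38.9, 7.0),
    ("tflearn", 35.4, 62.1),
]

# Upper edges of the POC groups, assumed right-inclusive (hypothetical choice).
EDGES = [5, 10, 15, 20, 25, 30]


def poc_group(poc):
    """Return the label of the POC group that a percentage falls into."""
    lower = 0
    for upper in EDGES:
        if poc <= upper:
            return f"{lower} - {upper}%"
        lower = upper
    return "> 30%"


def avg_pcc_by_group(rows):
    """Average the PCC of the projects in each POC group."""
    groups = defaultdict(list)
    for _name, poc, pcc in rows:
        groups[poc_group(poc)].append(pcc)
    return {label: sum(values) / len(values) for label, values in groups.items()}


averages = avg_pcc_by_group(ROWS)
for label, value in averages.items():
    print(f"{label}: {value:.1f}%")
```

Rounded to one decimal, the two groups in this sample give 14.8% and 49.0%, matching the Avg rows of parts 6 and 7.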
Association between Size and PCC
Subject | PLOC | PCC |
---|---|---|
tianshou | 21,257 | 75.3% |
tflearn | 10,297 | 62.1% |
catalyst | 30,581 | 39.3% |
ludwig | 42,097 | 38.7% |
MONAI | 72,288 | 32.3% |
deepvariant | 35,254 | 30.6% |
pyod | 10,769 | 24.7% |
DIG | 22,257 | 23.1% |
stellargraph | 27,816 | 20.6% |
autokeras | 10,417 | 18.8% |
coach | 24,709 | 17.9% |
texar | 31,757 | 17.1% |
clearml | 84,867 | 17.3% |
DeepLabCut | 30,419 | 16.7% |
Hub | 22,636 | 16.7% |
keras | 146,799 | 15.9% |
imgaug | 89,115 | 15.9% |
TensorLayer | 29,952 | 14.2% |
tensorpack | 24,811 | 12.5% |
torch-points3d | 25,560 | 10.2% |
allennlp | 56,320 | 7.5% |
TTS | 25,247 | 7.0% |
ignite | 41,000 | 6.9% |
transformers | 449,759 | 6.6% |
ray | 260,916 | 5.9% |
luminoth | 11,467 | 4.5% |
deepchem | 60,320 | 4.5% |
tfx | 80,696 | 3.7% |
nni | 80,385 | 3.1% |
horovod | 36,344 | 2.8% |
addons | 28,214 | 2.6% |
chainer | 132,991 | 2.4% |
DeepPavlov | 27,778 | 2.0% |
tensor2tensor | 86,231 | 0.7% |
GPflow | 20,484 | 0 |
OpenNMT-py | 14,224 | 0 |
PySyft | 58,479 | 0 |
pytorch-lightning | 60,251 | 0 |
raster-vision | 16,718 | 0 |
sonnet | 13,172 | 0 |
spaCy | 76,679 | 0 |
torchio | 12,408 | 0 |
DeepFaceLab | 12,831 | 0 |
faceswap | 28,756 | 0 |
pytorchvideo | 19,058 | 0 |
We calculated the Pearson correlation coefficient between project size (PLOC) and the percentage of co-changed clones (PCC) across all projects and obtained a value of 0.05.
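For readers who want to repeat this kind of check, here is a minimal sketch of a Pearson correlation in plain Python. It is an illustration under stated assumptions, not the original analysis script: the helper `pearson` and the list `SAMPLE` are hypothetical names, and `SAMPLE` holds only a handful of rows from the table above, so its result will differ from the coefficient reported for the full data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/n factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative subset of (PLOC, PCC %) rows from the table above.
SAMPLE = [
    (21257, 75.3),   # tianshou
    (10297, 62.1),   # tflearn
    (72288, 32.3),   # MONAI
    (146799, 15.9),  # keras
    (449759, 6.6),   # transformers
    (76679, 0.0),    # spaCy
]

ploc = [row[0] for row in SAMPLE]
pcc = [row[1] for row in SAMPLE]
r = pearson(ploc, pcc)
print(f"Pearson r on the sample: {r:.2f}")
```

Feeding all rows of the table into `pearson` would be the way to compare against the reported coefficient. A value near 0 indicates little linear association between size and PCC.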
Association between LC and co-changed clones.
Association between LC and PCTC.
Overview
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
PCTC | 5.7% | 11.9% | 8.9% | 33.5% | 48.5% | 60.4% |
Number of co-changed clones of each type in every LC group.
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
Type 1 | 41 | 159 | 38 | 24 | 58 | 78 |
Type 2 | 313 | 123 | 30 | 32 | 12 | 82 |
Type 3 | 1512 | 1753 | 582 | 538 | 253 | 598 |
Total | 1866 | 2035 | 650 | 594 | 323 | 758 |
Distribution of the number of co-changed clones of each type in every project
Number of clones of each type in every LC group.
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
Type 1 | 2665 | 1387 | 208 | 54 | 92 | 164 |
Type 2 | 5830 | 1341 | 357 | 97 | 28 | 117 |
Type 3 | 24055 | 14429 | 6746 | 1621 | 546 | 973 |
Total | 32550 | 17157 | 7311 | 1772 | 666 | 1254 |
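The PCTC row of the overview follows from the Total rows of the two count tables above: for each LC group, divide the number of co-changed clones by the number of all clones. A minimal sketch, with hypothetical names (`BINS`, `CO_CHANGED`, `ALL_CLONES`, `percentage`) and the totals copied from those tables:

```python
# LC groups and the Total rows of the two count tables above.
BINS = ["5-20", "21-40", "41-60", "61-80", "81-100", "101-∞"]
CO_CHANGED = [1866, 2035, 650, 594, 323, 758]
ALL_CLONES = [32550, 17157, 7311, 1772, 666, 1254]


def percentage(part, whole):
    """Return part / whole as a percentage."""
    return 100.0 * part / whole


# PCTC per LC group: co-changed clones over all clones in that group.
pctc = {
    lc_bin: percentage(changed, total)
    for lc_bin, changed, total in zip(BINS, CO_CHANGED, ALL_CLONES)
}

for lc_bin, value in pctc.items():
    print(f"{lc_bin}: {value:.1f}%")
```

Rounded to one decimal, the values agree with the overview (5.7%, 11.9%, 8.9%, 33.5%, 48.5%, 60.4%).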
Distribution of the number of clones of each type in every project.
Association between LC and NormPCTC.
Overview
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
NormPCTC | 8.7% | 13.5% | 18.8% | 33.4% | 44.4% | 49.3% |
Total LOC of co-changed clones of each type in every LC group.
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
Type 1 | 636 | 1209 | 1007 | 1048 | 1999 | 1363 |
Type 2 | 2269 | 2503 | 1359 | 1329 | 377 | 2473 |
Type 3 | 6145 | 14358 | 9792 | 8801 | 5978 | 19218 |
Total | 9050 | 18070 | 12158 | 11178 | 8354 | 23054 |
Total LOC of co-changed clones of each type in every LC group and project
Total LOC of clones of each type in every LC group.
LC (lines) | 5-20 | 21-40 | 41-60 | 61-80 | 81-100 | 101-∞ |
---|---|---|---|---|---|---|
Type 1 | 12696 | 13866 | 5226 | 2777 | 2969 | 2040 |
Type 2 | 25450 | 22074 | 8937 | 4372 | 1630 | 6575 |
Type 3 | 65484 | 98066 | 50401 | 26344 | 14200 | 38155 |
Total | 103630 | 134006 | 64564 | 33493 | 18799 | 46770 |
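NormPCTC weights each clone by its length: for every LC group, the LOC of co-changed clones is divided by the LOC of all clones. The sketch below sums the per-type rows of the two LOC tables above and then takes that ratio. The names `CO_CHANGED_LOC`, `ALL_LOC`, and `column_totals` are hypothetical, and the code illustrates the arithmetic rather than the study's own tooling.

```python
# Per-type LOC rows copied from the two tables above,
# in LC group order: 5-20, 21-40, 41-60, 61-80, 81-100, 101-∞.
CO_CHANGED_LOC = {
    "Type 1": [636, 1209, 1007, 1048, 1999, 1363],
    "Type 2": [2269, 2503, 1359, 1329, 377, 2473],
    "Type 3": [6145, 14358, 9792, 8801, 5978, 19218],
}
ALL_LOC = {
    "Type 1": [12696, 13866, 5226, 2777, 2969, 2040],
    "Type 2": [25450, 22074, 8937, 4372, 1630, 6575],
    "Type 3": [65484, 98066, 50401, 26344, 14200, 38155],
}


def column_totals(table):
    """Sum the per-type rows column by column (one total per LC group)."""
    return [sum(column) for column in zip(*table.values())]


co_changed_totals = column_totals(CO_CHANGED_LOC)
all_totals = column_totals(ALL_LOC)

# NormPCTC per LC group: co-changed clone lines over all clone lines.
norm_pctc = [
    100.0 * changed / total
    for changed, total in zip(co_changed_totals, all_totals)
]
print([round(value, 1) for value in norm_pctc])
```

The column sums reproduce the Total rows of both tables, and the rounded ratios agree with the NormPCTC overview (8.7%, 13.5%, 18.8%, 33.4%, 44.4%, 49.3%).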