without code snippet:
Tag | SO Questions | SE Questions | PM Questions | Total |
---|---|---|---|---|
domain-driven-design | 3507 | 835 | - | 4497 |
agile | 965 | 1093 | 1351 | 3410 |
devops | 3156 | 73 | 21 | 3342 |
scrum | 729 | 717 | 1534 | 2981 |
development-process | 31 | 676 | 104 | 813 |
kanban | 173 | 47 | 309 | 532 |
waterfall | 72 | 46 | 68 | 191 |
sdlc | 64 | 88 | 11 | 163 |
extreme-programming | 45 | 55 | 17 | 117 |
agile-project-management | 77 | - | - | 77 |
prince2 | - | 3 | 56 | 59 |
agile-processes | 53 | - | - | 53 |
mda | 47 | - | - | 47 |
kanban-board | - | - | 42 | 42 |
model-driven-development | 39 | - | - | 39 |
rational-unified-process | 18 | 13 | - | 31 |
rup | 23 | - | 5 | 28 |
safe | - | - | 24 | 24 |
iterative-development | - | 22 | - | 22 |
lean | - | 22 | - | 22 |
scrumban | - | - | 17 | 17 |
dsdm | - | - | 12 | 12 |
scaled-agile-framework | - | 9 | - | 9 |
personal-software-process | 7 | 2 | - | 9 |
nexus | - | - | 7 | 7 |
feature-driven | 3 | - | - | 3 |
large-scale-scrum | - | 3 | - | 3 |
dsdm-atern | 2 | - | - | 2 |
Total (duplicates removed) | 8486 (8404) | 2978 (2346) | 2439 (2372) | 13903 (13082) |
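The difference between the raw totals and the totals in parentheses comes from questions that carry more than one of the listed tags. A minimal sketch of that counting, assuming each question has a unique ID; the IDs and the `questions_by_tag` mapping below are made up for illustration and are not taken from the dataset:

```python
# Hypothetical question IDs grouped by tag (illustrative only, not real data).
questions_by_tag = {
    "scrum": {101, 102, 103},
    "agile": {102, 103, 104},
    "kanban": {105},
}

# Raw total: a question tagged with both "scrum" and "agile" is counted twice.
raw_total = sum(len(ids) for ids in questions_by_tag.values())

# Deduplicated total: the union of the ID sets counts each question once.
unique_total = len(set().union(*questions_by_tag.values()))

print(raw_total, unique_total)
```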
Preprocessing steps: we removed code snippets and HTML tags; converted the text to lower-case word tokens and removed punctuation signs; removed common English words (i.e., stop-words); lemmatized the tokens, keeping only nouns, adjectives, verbs, and adverbs; applied stemming, using the improved version of the Porter stemmer [20]; and removed words that appeared fewer than 10 times in total.
The Latent Dirichlet Allocation (LDA) modeling technique [13] was adopted for this paper.
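The preprocessing steps above can be sketched with the Python standard library alone. This is an illustrative approximation, not the pipeline actually used: the function names and the tiny `STOP_WORDS` set are assumptions, and lemmatization and Porter stemming are left out because they need an external NLP library.

```python
import re
from collections import Counter
from html import unescape

# Hypothetical, deliberately tiny stop-word list; a real run would use a full one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "for", "on"}


def clean_post(html_body):
    """Strip code snippets and HTML tags, lower-case, drop punctuation and stop-words."""
    # Remove <pre>...</pre> and <code>...</code> blocks together with their content.
    text = re.sub(r"<pre\b.*?</pre>|<code\b.*?</code>", " ", html_body, flags=re.S | re.I)
    # Remove any remaining HTML tags, then decode entities such as &amp;.
    text = unescape(re.sub(r"<[^>]+>", " ", text)).lower()
    # Keeping only alphabetic runs discards punctuation and digits.
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def drop_rare_words(docs, min_count=10):
    """Remove words that appear fewer than min_count times across all documents."""
    counts = Counter(t for doc in docs for t in doc)
    return [[t for t in doc if counts[t] >= min_count] for doc in docs]


post = "<p>The Sprint backlog is too long!</p><pre><code>int x = 1;</code></pre>"
tokens = clean_post(post)
```

In a fuller pipeline, a lemmatizer with part-of-speech filtering and a Porter stemmer would run between `clean_post` and `drop_rare_words`.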
We calculated topic coherence scores (c_v) for topic counts ranging from 2 to 100, keeping alpha and beta at their default values. We selected 15 topics: its score is among the highest in the range, second only to the 4-topic model in the table below.
topics | c_v |
---|---|
4 | 0.49238079391933737 |
15 | 0.4803587450047443 |
5 | 0.48020645922082095 |
3 | 0.4800883619689366 |
6 | 0.47612252341174427 |
12 | 0.47392385923203323 |
7 | 0.4688837055506901 |
17 | 0.4658745233276905 |
18 | 0.4642817316407959 |
10 | 0.463303793989635 |
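To make the choice of 15 topics easier to check, the scores in the table above can be ranked programmatically. The sketch below only re-sorts the reported values; `rank_by_coherence` is a hypothetical helper name, and computing c_v itself would require a topic-modeling library.

```python
# c_v scores copied from the table above (number of topics -> coherence).
coherence = {
    4: 0.49238079391933737,
    15: 0.4803587450047443,
    5: 0.48020645922082095,
    3: 0.4800883619689366,
    6: 0.47612252341174427,
    12: 0.47392385923203323,
    7: 0.4688837055506901,
    17: 0.4658745233276905,
    18: 0.4642817316407959,
    10: 0.463303793989635,
}


def rank_by_coherence(scores, top_n=3):
    """Return the top_n (topic_count, score) pairs, best score first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


top = rank_by_coherence(coherence)
# Gap between the best-scoring model (4 topics) and the selected one (15 topics).
gap = coherence[4] - coherence[15]
```

The gap between the 4-topic and 15-topic models is small (about 0.012), which is in line with the text treating 15 as one of the highest-scoring options.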
Next, with the number of topics fixed at 15, we searched for the alpha and beta hyperparameter values that maximize the coherence score. The best setting found was:
alpha | beta | topics | c_v |
---|---|---|---|
0.41 | 0.81 | 15 | 0.4953024941697182 |
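A hyperparameter search of this kind is commonly done as a grid search. The sketch below shows the general shape under stated assumptions: the grid (0.01 to 0.91 in steps of 0.1) is a guess, not the grid used here, and `toy_score` is a placeholder that merely peaks at the reported values so the example runs without training any model. In practice it would be replaced by a function that trains a 15-topic LDA model and returns its c_v score.

```python
from itertools import product


def grid_search(score_fn, alphas, betas):
    """Evaluate score_fn on every (alpha, beta) pair and return the best triple."""
    return max((score_fn(a, b), a, b) for a, b in product(alphas, betas))


# Assumed grid: 0.01, 0.11, ..., 0.91 (rounded to avoid float noise).
grid = [round(0.01 + 0.1 * i, 2) for i in range(10)]


def toy_score(alpha, beta):
    """Placeholder for 'train LDA with 15 topics and return c_v' (not a real score)."""
    return -((alpha - 0.41) ** 2 + (beta - 0.81) ** 2)


best_score, best_alpha, best_beta = grid_search(toy_score, grid, grid)
```

Because every grid point requires training a full model, the cost grows with the product of the two grid sizes, which is why such searches usually use coarse grids.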
We assigned a descriptive label to each of the topics. To do so, for each topic we carefully analyzed the top 10 terms and manually checked some of the highly relevant documents of the topic.
Topic Index | Top LDA words | Top 10 documents | Topic Label | Document Count (Rate) |
---|---|---|---|---|
4 | 0.025*"time" + 0.016*"chang" + 0.013*"problem" + 0.010*"new" + 0.008*"much" + 0.008*"tri" + 0.008*"veri" + 0.007*"even" + 0.007*"start" + 0.007*"onli" | Topic #4 top 10 documents | Project, Team and Time Management | 7933 (0.57) |
2 | 0.053*"domain" + 0.045*"model" + 0.023*"applic" + 0.022*"datum" + 0.022*"design" + 0.021*"servic" + 0.017*"layer" + 0.015*"class" + 0.014*"implement" + 0.013*"databas" | Topic #2 top 10 documents | Architecture (Domain model, Design patterns and layers) in DDD | 3740 (0.27) |
14 | 0.033*"design" + 0.028*"requir" + 0.019*"system" + 0.014*"user" + 0.012*"function" + 0.012*"softwar" + 0.012*"differ" + 0.012*"implement" + 0.011*"document" + 0.011*"write" | Topic #14 top 10 documents | Software Design and Requirements | 3561 (0.26) |
7 | 0.046*"aggreg" + 0.038*"object" + 0.032*"entiti" + 0.029*"root" + 0.027*"state" + 0.018*"valu" + 0.017*"model" + 0.016*"valid" + 0.016*"order" + 0.015*"rule" | Topic #7 top 10 documents | Entities, Value Objects and Aggregates in DDD | 2889 (0.21) |
11 | 0.117*"user" + 0.025*"task" + 0.020*"ticket" + 0.019*"creat" + 0.019*"group" + 0.018*"item" + 0.018*"want" + 0.016*"add" + 0.014*"board" + 0.013*"custom" | Topic #11 top 10 documents | Task Management in Kanban Board | 2935 (0.21) |
13 | 0.084*"stori" + 0.077*"sprint" + 0.036*"team" + 0.035*"estim" + 0.027*"task" + 0.025*"point" + 0.024*"plan" + 0.023*"backlog" + 0.017*"product" + 0.015*"complet" | Topic #13 top 10 documents | Story Estimation in Scrum Sprint | 2964 (0.21) |
3 | 0.040*"server" + 0.037*"machin" + 0.026*"applic" + 0.021*"run" + 0.018*"environ" + 0.018*"configur" + 0.016*"servic" + 0.013*"error" + 0.012*"local" + 0.011*"app" | Topic #3 top 10 documents | Services and tools in DevOps | 2809 (0.20) |
8 | 0.043*"tool" + 0.021*"sourc" + 0.017*"document" + 0.016*"web" + 0.015*"com" + 0.014*"app" + 0.013*"page" + 0.012*"find" + 0.011*"file" + 0.011*"look" | Topic #8 top 10 documents | Tools and Plugins in Agile Software Development | 2833 (0.20) |
0 | 0.044*"branch" + 0.044*"releas" + 0.039*"build" + 0.036*"deploy" + 0.034*"version" + 0.029*"featur" + 0.027*"merg" + 0.019*"chang" + 0.017*"commit" + 0.013*"environ" | Topic #0 top 10 documents | Continuous Integration, Build and Deployment | 2587 (0.19) |
5 | 0.147*"team" + 0.056*"scrum" + 0.049*"product" + 0.036*"develop" + 0.021*"manag" + 0.018*"member" + 0.017*"owner" + 0.015*"peopl" + 0.014*"master" + 0.013*"role" | Topic #5 top 10 documents | Team Roles in Scrum | 2462 (0.18) |
1 | 0.062*"event" + 0.053*"context" + 0.033*"bound" + 0.020*"read" + 0.019*"microservic" + 0.018*"servic" + 0.017*"system" + 0.016*"command" + 0.015*"datum" + 0.014*"messag" | Topic #1 top 10 documents | Events, Bounded Contexts and Microservices in DDD | 1819 (0.13) |
9 | 0.084*"agil" + 0.049*"softwar" + 0.045*"develop" + 0.041*"process" + 0.032*"custom" + 0.019*"methodolog" + 0.018*"requir" + 0.018*"iter" + 0.016*"scrum" + 0.013*"deliv" | Topic #9 top 10 documents | Software Development Methodology concepts | 1836 (0.13) |
12 | 0.141*"test" + 0.106*"code" + 0.064*"develop" + 0.036*"bug" + 0.026*"review" + 0.026*"write" + 0.021*"unit" + 0.017*"fix" + 0.014*"qa" + 0.013*"autom" | Topic #12 top 10 documents | Software Tests (unit tests, integration tests, acceptance tests, tdd) | 1479 (0.11) |
6 | 0.255*"project" + 0.068*"manag" + 0.038*"client" + 0.030*"compani" + 0.020*"contract" + 0.018*"cost" + 0.016*"scope" + 0.013*"resourc" + 0.012*"deadlin" + 0.011*"plan" | Topic #6 top 10 documents | Project Managers’ Responsibility and Contract Management | 905 (0.07) |
10 | 0.066*"day" + 0.056*"meet" + 0.032*"daili" + 0.030*"hour" + 0.019*"stand" + 0.017*"minut" + 0.015*"time" + 0.015*"activ" + 0.014*"schedul" + 0.014*"employe" | Topic #10 top 10 documents | Meetings in Scrum Team | 499 (0.04) |
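The rate column is not defined in the text. The reported values appear consistent with dividing each topic's document count by 13903, the overall question total before duplicate removal. That reading is an inference from the numbers, checked here on a few rows:

```python
# Assumed denominator: the overall "Total" from the tag table (before deduplication).
TOTAL_QUESTIONS = 13903

# Document counts copied from the topic table (topic index -> count).
topic_counts = {4: 7933, 2: 3740, 6: 905, 10: 499}

# Rate = share of all questions associated with the topic, rounded to 2 decimals.
rates = {topic: round(count / TOTAL_QUESTIONS, 2) for topic, count in topic_counts.items()}

for topic, rate in rates.items():
    print(f"topic {topic}: {topic_counts[topic]} documents, rate {rate:.2f}")
```

Summed over all 15 topics the counts exceed 13903, so a question is evidently counted under more than one topic; the rates therefore do not add up to 1.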