Informing policy with text mining

technological change and social challenges

Supplementary materials

Supplementary tables

Various robustness checks

Charts

Interactive charts used in paper

Raw data and codes

Zenodo & Gitlab

Abstract

The rapid development and adoption of information and communication technologies (ICT) and digital services have major social implications. Policy-makers often struggle to design appropriate regulations, defend the rights of citizens and ensure competition. This work presents several methodologies that help stakeholders identify emerging technologies and related social challenges by text mining news articles. The analysis demonstrates that while each text mining algorithm provides insightful results, their combination yields a more detailed and robust overview of media discussions. The results present early signals and trends, the relationships between technologies and social challenges, and changing attitudes towards selected tech issues.

Keywords:
online news, web scraping, text mining, sentiment analysis

Supplementary tables

Various robustness checks

Websites' categories included in the web-scraping process

Table OA.1

Source | Sections | No. of articles | Weight
Euractiv | Digital section | 1728 | 5%
The Conversation | Science+Technology section | 2171 | 5%
Politico Europe | Data and digitization, Technology, Sustainability sections | 2372 | 5%
IEEE Spectrum | Tech-talk, The-human-os, Riskfactor, Automaton, View-from-the-valley, Nanoclast, Cars-that-think | 3074 | 5%
The Guardian | Technology tag | 6191 | 12%
Fastcompany | Technology section | 7381 | 5%
Techforge | www.cloudcomputing-news.net, www.developer-tech.com, www.enterprise-cio.com, www.iottechnews.com, www.marketingtechnews.net, virtualreality-news.net, www.telecomstechnews.com | 8786 | 5%
Arstechnica | Biz and IT, Tech, Science, Policy, Cars | 16808 | 5%
Reuters | Technology section | 20071 | 5%
The Verge | Tech, Science | 30636 | 9%
Gizmodo | All sections | 30740 | 9%
The Register | All sections | 32908 | 12%
ZDNet | Artificial intelligence, banking, data centers, data management, developer, digital transformation, e-commerce, enterprise software, EU, Future of Work, Google, Government, Great debate, Innovation, Internet of things, IT priorities, legal, mastering business analytics, networking, open source, reimagining the enterprise, robotics, security, smart cities, social enterprise, startups, tech industry, virtual reality, 3d printing | 33904 | 9%
TechCrunch | All sections | 50730 | 9%

Links to the webscraped sources

Table OA.2

Source Link
The Register https://www.theregister.co.uk/
ZDNet https://www.zdnet.com/
Gizmodo https://gizmodo.com/
Reuters https://www.reuters.com/
Arstechnica https://arstechnica.com/
The Guardian https://www.theguardian.com/uk
Fastcompany https://www.fastcompany.com/
Techforge https://www.techforge.pub/
IEEE Spectrum https://spectrum.ieee.org/
Politico Europe https://www.politico.eu/
The Conversation https://theconversation.com/
Euractiv https://www.euractiv.com/
The Verge https://www.theverge.com/
TechCrunch https://techcrunch.com/

Sources readability indices

Table OA.3

Sources Flesch readability std FOG index std Sentence count std
Euractiv 42.8 10.0 14.5 2.3 32.0 21.0
Techforge 44.1 10.8 14.8 2.3 21.0 12.0
Politico 46.4 10.8 14.1 3.0 33.0 26.0
Reuters 46.6 9.7 14.6 2.5 15.0 10.0
ZDNet 47.8 11.7 13.9 2.4 25.0 20.0
IEEE 50.0 8.6 13.6 1.8 37.0 22.0
The Register 51.2 9.9 13.9 2.2 21.0 14.0
The Conversation 52.3 8.4 12.9 1.4 39.0 10.0
Arstechnica 53.2 9.4 13.0 1.8 29.0 18.0
The Guardian 53.4 10.6 13.4 2.2 34.0 28.0
Techcrunch 53.8 10.7 13.3 2.5 24.0 20.0
Fastcompany 54.2 10.5 13.2 2.6 34.0 34.0
Gizmodo 56.1 11.1 13.0 2.5 24.0 22.0
The Verge 57.0 9.7 13.0 2.2 21.0 25.0

True positive rates

Table OB.1

# growth top target top prediction TPR FPR
1 1.5 50 2000 0.580000 0.015233
2 1.5 50 1250 0.580000 0.009436
3 1.5 50 2500 0.580000 0.019097
4 1.5 50 1000 0.580000 0.007504
5 1.5 50 1500 0.580000 0.011368
6 2 50 2000 0.560000 0.015240
7 1.5 50 500 0.560000 0.003648
8 1.5 50 750 0.560000 0.005580
9 2 50 1500 0.560000 0.011376
10 2 50 2500 0.560000 0.019105
11 1.5 100 2500 0.550000 0.018903
12 1.5 100 1000 0.550000 0.007306
13 1.5 100 2000 0.550000 0.015038
14 1.5 100 1500 0.550000 0.011172
15 1.5 100 1250 0.550000 0.009239
16 2 50 1250 0.540000 0.009452
17 2 50 1000 0.540000 0.007520
18 1.5 100 750 0.520000 0.005397
19 1.5 250 2500 0.512000 0.018360
20 1.25 50 1500 0.500000 0.011399
21 1.25 50 2000 0.500000 0.015264
22 1.25 50 250 0.500000 0.001739
23 1.25 50 500 0.500000 0.003671
24 1.25 50 750 0.500000 0.005603
25 1.25 50 1000 0.500000 0.007535
26 1.25 50 1250 0.500000 0.009467
27 1.25 50 2500 0.500000 0.019128
28 1.5 250 2000 0.492000 0.014529
29 1.25 100 1500 0.490000 0.011218
30 1.25 100 1250 0.490000 0.009285
31 1.25 100 1000 0.490000 0.007353
32 1.25 100 750 0.490000 0.005420
33 1.25 100 500 0.490000 0.003487
34 1.25 100 2000 0.490000 0.015084
35 2 100 2500 0.490000 0.018950
36 1.25 100 2500 0.490000 0.018950
37 1.5 250 1500 0.480000 0.010682
38 2 50 750 0.480000 0.005611
39 1.5 100 500 0.470000 0.003502
40 2 100 2000 0.470000 0.015099
41 2 100 1500 0.470000 0.011234
42 1.25 500 2500 0.468000 0.017574
43 1.25 250 2500 0.468000 0.018445
44 1.25 250 2000 0.468000 0.014575
45 1.25 250 1500 0.468000 0.010705
46 1.25 250 1000 0.464000 0.006842
47 1.25 250 1250 0.464000 0.008778
48 1.25 500 2000 0.462000 0.013719
49 3 50 2500 0.460000 0.019143
50 1.25 100 250 0.450000 0.001585
51 1.25 500 1500 0.450000 0.009888
52 1.5 250 1250 0.448000 0.008809
53 1.25 250 750 0.448000 0.004938
54 3 50 1250 0.440000 0.009490
55 1.25 750 2500 0.440000 0.016862
56 2 100 1250 0.440000 0.009324
57 3 50 1500 0.440000 0.011423
58 3 50 2000 0.440000 0.015287
59 1.25 500 1250 0.426000 0.008042
60 1.5 500 2500 0.426000 0.017737
61 1.25 750 2000 0.424000 0.013070
62 1.5 50 250 0.420000 0.001770
63 3 100 2500 0.420000 0.019004
64 2 50 500 0.420000 0.003702
65 2 100 1000 0.410000 0.007414
66 1.5 250 1000 0.408000 0.006951
67 1.25 250 500 0.408000 0.003081
68 2 250 2500 0.408000 0.018561
69 1.25 50 100 0.400000 0.000618
70 3 50 1000 0.400000 0.007574
71 1.25 1000 2500 0.400000 0.016350
72 1.25 750 1500 0.397333 0.009340
73 1.5 500 2000 0.392000 0.013991
74 1.15 750 2500 0.392000 0.017142
75 1.25 500 1000 0.390000 0.006243
76 1.15 750 2000 0.389333 0.013272
77 1.15 750 1500 0.381333 0.009433
78 3 50 750 0.380000 0.005649
79 3 100 2000 0.380000 0.015169
80 1.15 1000 2500 0.376000 0.016537
81 1.5 750 2500 0.376000 0.017235
82 1.25 1000 2000 0.376000 0.012644
83 1.25 1250 2500 0.373600 0.015859
84 1.15 500 2500 0.372000 0.017946
85 1.15 500 2000 0.372000 0.014068
86 1.15 500 1500 0.372000 0.010191
87 2 250 2000 0.372000 0.014761
88 1.15 500 1250 0.370000 0.008259
89 1.5 250 750 0.368000 0.005093
90 1.15 1000 2000 0.368000 0.012706
91 1.15 750 1250 0.366667 0.007576
92 1.25 750 1250 0.365333 0.007584
93 1.15 1250 2500 0.364800 0.015945
94 1.15 250 2500 0.364000 0.018647
95 1.15 250 2000 0.364000 0.014776
96 1.15 250 1250 0.364000 0.008971
97 1.15 250 1000 0.364000 0.007036
98 1.15 250 1500 0.364000 0.010906
99 1.15 500 1000 0.362000 0.006352
100 2 100 750 0.360000 0.005520
101 1.15 250 750 0.356000 0.005116
102 1.15 1000 1500 0.353000 0.008930
103 1.15 1250 2000 0.352000 0.012169
104 1.25 500 750 0.352000 0.004452
105 1.5 500 1500 0.352000 0.010268
106 1.15 250 500 0.352000 0.003189
107 3 100 1500 0.350000 0.011326
108 3 100 1250 0.350000 0.009394
109 1.15 500 750 0.346000 0.004475
110 1.25 1250 2000 0.344000 0.012247
111 1.15 1500 2500 0.343333 0.015515
112 1.25 1500 2500 0.342667 0.015523
113 1.15 100 1250 0.340000 0.009401
114 1.15 100 2500 0.340000 0.019066
115 1.15 100 1500 0.340000 0.011334
116 1.15 100 2000 0.340000 0.015200
117 1.15 100 1000 0.340000 0.007469
118 1.15 100 750 0.340000 0.005536
119 1.15 100 500 0.340000 0.003603
120 1.15 100 250 0.340000 0.001670
121 1.15 750 1000 0.338667 0.005797
122 1.1 1000 2500 0.338000 0.016832
123 1.1 1000 2000 0.336000 0.012955
124 1.1 1250 2500 0.335200 0.016233
125 1.25 1000 1500 0.335000 0.009070
126 1.5 750 2000 0.332000 0.013606
127 1.5 1000 2500 0.331000 0.016887
128 1.1 1000 1500 0.330000 0.009109
129 1.1 1250 2000 0.328800 0.012395
130 2 250 1500 0.328000 0.010976
131 1.1 1500 2500 0.328000 0.015694
132 1.15 1000 1250 0.327000 0.007186
133 1.15 1500 2000 0.322667 0.011849
134 2 500 2500 0.322000 0.018140
135 1.15 50 1500 0.320000 0.011469
136 1.15 50 250 0.320000 0.001808
137 1.15 50 500 0.320000 0.003741
138 1.15 50 2000 0.320000 0.015333
139 1.15 50 2500 0.320000 0.019197
140 1.15 50 1000 0.320000 0.007605
141 1.15 1250 1500 0.320000 0.008581
142 1.15 50 1250 0.320000 0.009537
143 1.15 50 750 0.320000 0.005673
144 1.1 1250 1500 0.318400 0.008596
145 1.5 500 1250 0.318000 0.008461
146 1.1 1500 2000 0.318000 0.011904
147 1.1 1000 1250 0.317000 0.007264
148 1.25 750 1000 0.316000 0.005929
149 1.1 750 2500 0.314667 0.017592
150 1.1 750 1500 0.314667 0.009822
151 1.1 750 2000 0.314667 0.013707
152 1.15 2000 2500 0.313500 0.014697
153 1.1 750 1250 0.313333 0.007887
154 1.1 2000 2500 0.310000 0.014752
155 1.25 1500 2000 0.309333 0.012005
156 1.15 750 750 0.305333 0.004048
157 1.1 750 1000 0.304000 0.005999
158 1.25 1000 1250 0.302000 0.007381
159 2 250 1250 0.300000 0.009095
160 1.1 1500 1500 0.296667 0.008246
161 1.15 500 500 0.296000 0.002730
162 1.1 1250 1250 0.295200 0.006872
163 1.25 1250 1500 0.295200 0.008823
164 1.25 2000 2500 0.294000 0.015003
165 1.15 1000 1000 0.293000 0.005504
166 1.15 1250 1250 0.292800 0.006896
167 1.1 500 1500 0.292000 0.010501
168 1.1 500 2500 0.292000 0.018256
169 1.1 1000 1000 0.292000 0.005512
170 1.1 500 1000 0.292000 0.006623
171 1.1 500 1250 0.292000 0.008562
172 1.1 500 2000 0.292000 0.014378
173 1.1 2000 2000 0.290500 0.011134
174 1.1 500 750 0.288000 0.004700
175 1.15 1500 1500 0.287333 0.008355
176 1.1 2500 2500 0.286800 0.014046
177 1.5 1000 2000 0.286000 0.013344
178 1.5 750 1500 0.285333 0.009993
179 1.1 750 750 0.285333 0.004165
180 1.5 1250 2500 0.284800 0.016725
181 1.15 2000 2000 0.284000 0.011236
182 1.1 250 1500 0.284000 0.011061
183 1.1 250 2000 0.284000 0.014931
184 1.1 250 1000 0.284000 0.007191
185 1.1 250 2500 0.284000 0.018801
186 1.1 250 1250 0.284000 0.009126
187 1.1 250 750 0.284000 0.005256
188 1.1 250 500 0.284000 0.003321
189 1.15 2500 2500 0.280800 0.014164
190 1.5 100 250 0.280000 0.001716
191 1.15 50 100 0.280000 0.000665
192 1.15 250 250 0.280000 0.001393
193 2 500 2000 0.278000 0.014433
194 1.1 500 500 0.278000 0.002800
195 1.1 1500 1250 0.273333 0.006565
196 1.1 250 250 0.272000 0.001409
197 3 100 1000 0.270000 0.007523
198 1.25 500 500 0.268000 0.002838
199 1.5 500 1000 0.268000 0.006716
200 1.25 750 750 0.266667 0.004274
201 1.25 1250 1250 0.266400 0.007153
202 1.1 1250 1000 0.265600 0.005211
203 1.1 1000 750 0.264000 0.003784
204 1.25 250 250 0.264000 0.001424
205 1.25 1500 1500 0.260667 0.008668
206 1.1 2500 2000 0.260400 0.010627
207 1.15 100 100 0.260000 0.000572
208 2 100 500 0.260000 0.003665
209 2 50 250 0.260000 0.001832
210 3 250 2500 0.260000 0.018848
211 1.15 1500 1250 0.260000 0.006722
212 1.25 50 50 0.260000 0.000286
213 3 50 500 0.260000 0.003764
214 1.25 2000 2000 0.258000 0.011644
215 1.25 1000 1000 0.257000 0.005785
216 1.5 250 500 0.256000 0.003375
217 1.15 1000 750 0.255000 0.003854
218 1.1 2000 1500 0.254000 0.007784
219 1.25 2500 2500 0.252000 0.014731
220 1.15 1250 1000 0.252000 0.005344
221 1.5 750 1250 0.252000 0.008244
222 1.25 100 100 0.250000 0.000580
223 2 750 2500 0.249333 0.017973
224 1.15 2500 2000 0.248800 0.010855
225 1.5 1500 2500 0.244667 0.016671
226 1.15 2000 1500 0.244000 0.007941
227 1.5 1250 2000 0.243200 0.013230
228 2 250 1000 0.240000 0.007276
229 3 100 750 0.240000 0.005613
230 1.1 750 500 0.237333 0.002502
231 1.1 1500 1000 0.237333 0.005033
232 1.15 750 500 0.232000 0.002533
233 1.1 1250 750 0.231200 0.003596
234 3 250 2000 0.228000 0.015040
235 1.25 1500 1250 0.228000 0.007097
236 1.1 2000 1250 0.228000 0.006230
237 1.5 1000 1500 0.228000 0.009903
238 1.15 1500 1000 0.222667 0.005205
239 1.1 2500 1500 0.222400 0.007436
240 1.5 500 750 0.220000 0.004963
241 1.25 1250 1000 0.218400 0.005671
242 1.15 2000 1250 0.216500 0.006411
243 1.25 2500 2000 0.215200 0.011517
244 2 500 1500 0.214000 0.010803
245 1.25 1000 750 0.213000 0.004181
246 1.15 1250 750 0.212800 0.003776
247 1.1 100 1500 0.210000 0.011435
248 1.1 100 1000 0.210000 0.007569
249 1.1 100 2500 0.210000 0.019166
250 1.1 100 2000 0.210000 0.015300
251 1.25 2000 1500 0.210000 0.008474
252 1.1 100 1250 0.210000 0.009502
253 1.1 100 750 0.210000 0.005636
254 1.1 100 500 0.210000 0.003703
255 1.1 100 250 0.210000 0.001770
256 2 750 2000 0.206667 0.014336
257 1.15 2500 1500 0.206000 0.007759
258 1.5 1500 2000 0.204000 0.013240
259 1.1 1000 500 0.202000 0.002320
260 1.1 1500 750 0.201333 0.003502
261 1.5 750 1000 0.201333 0.006597
262 1.15 50 50 0.200000 0.000309
263 1.5 1000 1250 0.198000 0.008190
264 1.5 2000 2500 0.198000 0.016509
265 2 1000 2500 0.197000 0.017930
266 1.1 2500 1250 0.196800 0.005971
267 1.1 2000 1000 0.193500 0.004810
268 3 250 1500 0.192000 0.011239
269 1.1 500 250 0.192000 0.001194
270 2 250 750 0.192000 0.005434
271 1.25 750 500 0.190667 0.002774
272 1.5 1250 1500 0.189600 0.009852
273 1.25 1500 1000 0.186000 0.005635
274 1.15 1000 500 0.185000 0.002452
275 1.15 1500 750 0.184667 0.003697
276 2 500 1250 0.184000 0.008981
277 1.25 2000 1250 0.183500 0.006929
278 1.15 500 250 0.182000 0.001233
279 1.15 2500 1250 0.180400 0.006294
280 1.5 50 100 0.180000 0.000703
281 1.1 100 100 0.180000 0.000634
282 1.15 2000 1000 0.179000 0.005038
283 1.25 1250 750 0.177600 0.004119
284 1.25 2500 1500 0.172400 0.008421
285 1.1 1250 500 0.170400 0.002239
286 1.25 100 50 0.170000 0.000255
287 3 500 2500 0.170000 0.018729
288 3 250 1250 0.168000 0.009350
289 1.5 2000 2000 0.164000 0.013120
290 1.5 2500 2500 0.164000 0.016464
291 2 1250 2500 0.164000 0.017903
292 1.5 1250 1250 0.163200 0.008160
293 1.1 2500 1000 0.162400 0.004679
294 1.1 2000 750 0.161500 0.003351
295 1.15 100 50 0.160000 0.000263
296 2 1000 2000 0.160000 0.014325
297 1.5 750 750 0.158667 0.004903
298 1.5 1500 1500 0.158667 0.009864
299 1.5 1000 1000 0.156000 0.006571
300 1.25 500 250 0.156000 0.001334
301 2 750 1500 0.153333 0.010762
302 1.1 250 100 0.152000 0.000480
303 1.15 1250 500 0.152000 0.002418
304 2 100 250 0.150000 0.001817
305 1.25 1500 750 0.150000 0.004103
306 1.25 2500 1250 0.149200 0.006909
307 1.5 500 500 0.148000 0.003304
308 1.25 1000 500 0.148000 0.002741
309 1.15 2500 1000 0.146800 0.004986
310 1.1 1500 500 0.146667 0.002188
311 1.1 750 250 0.146667 0.001088
312 1.25 2000 1000 0.146500 0.005548
313 1.15 2000 750 0.145500 0.003602
314 1.5 250 250 0.140000 0.001664
315 3 50 250 0.140000 0.001878
316 1.1 50 50 0.140000 0.000332
317 1.5 50 50 0.140000 0.000332
318 1.1 50 1500 0.140000 0.011538
319 1.1 50 2500 0.140000 0.019267
320 1.1 50 500 0.140000 0.003810
321 1.1 50 750 0.140000 0.005742
322 1.1 50 1000 0.140000 0.007674
323 1.1 50 1250 0.140000 0.009606
324 1.1 50 2000 0.140000 0.015403
325 1.1 50 250 0.140000 0.001878
326 1.1 50 100 0.140000 0.000719
327 2 1500 2500 0.138667 0.017914
328 3 500 2000 0.138000 0.014976
329 2 500 1000 0.136000 0.007228
330 1.15 250 100 0.136000 0.000511
331 1.5 1500 1250 0.136000 0.008176
332 1.1 2500 750 0.132800 0.003293
333 1.5 2500 2000 0.132800 0.013140
334 2 1250 2000 0.132000 0.014314
335 1.15 750 250 0.130667 0.001181
336 1.15 1500 500 0.130000 0.002384
337 3 100 500 0.130000 0.003765
338 2 750 1250 0.129333 0.008959
339 1.5 1250 1000 0.127200 0.006560
340 1.5 2000 1500 0.124000 0.009824
341 1.5 1000 750 0.122000 0.004889
342 1.1 100 50 0.120000 0.000294
343 3 250 1000 0.120000 0.007508
344 1.25 1250 500 0.120000 0.002730
345 1.25 2500 1000 0.118800 0.005538
346 1.15 2500 750 0.118400 0.003576
347 2 1000 1500 0.118000 0.010760
348 1.1 1000 250 0.117000 0.001035
349 2 250 500 0.116000 0.003646
350 1.25 2000 750 0.116000 0.004065
351 1.1 2000 500 0.114000 0.002134
352 3 750 2500 0.113333 0.018766
353 1.25 250 100 0.112000 0.000557
354 2 1500 2000 0.110667 0.014335
355 1.5 100 100 0.110000 0.000688
356 2 2000 2500 0.106500 0.017945
357 1.5 1500 1000 0.106000 0.006573
358 1.25 750 250 0.104000 0.001337
359 3 500 1500 0.104000 0.011230
360 1.5 750 500 0.104000 0.003279
361 1.5 2000 1250 0.103500 0.008184
362 1.25 1500 500 0.100667 0.002728
363 1.5 2500 1500 0.099600 0.009855
364 1.15 2000 500 0.099500 0.002362
365 1.5 1250 750 0.098400 0.004891
366 2 500 750 0.098000 0.005437
367 1.15 1000 250 0.098000 0.001183
368 2 1000 1250 0.098000 0.008969
369 2 1250 1500 0.096000 0.010765
370 3 250 750 0.096000 0.005619
371 1.1 1250 250 0.093600 0.001037
372 2 750 1000 0.093333 0.007227
373 1.25 2500 750 0.092800 0.004081
374 1.1 2500 500 0.092400 0.002119
375 3 750 2000 0.092000 0.015005
376 3 500 1250 0.086000 0.009361
377 2 2500 2500 0.085200 0.018016
378 3 1000 2500 0.085000 0.018802
379 1.1 500 100 0.084000 0.000450
380 2 2000 2000 0.083500 0.014383
381 1.5 2500 1250 0.082800 0.008216
382 1.5 1500 750 0.082000 0.004901
383 1.15 250 50 0.080000 0.000232
384 1.15 2500 500 0.080000 0.002363
385 2 1500 1500 0.080000 0.010786
386 1.5 2000 1000 0.079500 0.006599
387 1.1 1500 250 0.079333 0.001024
388 2 1250 1250 0.079200 0.008979
389 1.15 1250 250 0.079200 0.001178
390 1.5 1000 500 0.079000 0.003278
391 1.25 1000 250 0.079000 0.001331
392 1.25 2000 500 0.076500 0.002723
393 1.1 250 50 0.076000 0.000240
394 1.15 500 100 0.074000 0.000489
395 3 100 250 0.070000 0.001879
396 1.5 500 250 0.070000 0.001667
397 1.5 100 50 0.070000 0.000332
398 2 1000 1000 0.070000 0.007241
399 3 750 1500 0.069333 0.011252
400 3 1000 2000 0.069000 0.015034
401 1.25 250 50 0.068000 0.000255
402 3 1250 2500 0.068000 0.018839
403 2 2500 2000 0.066800 0.014440
404 1.15 1500 250 0.066667 0.001172
405 2 750 750 0.066667 0.005439
406 2 1500 1250 0.066000 0.008996
407 1.5 2500 1000 0.063600 0.006625
408 1.5 1250 500 0.063200 0.003284
409 1.25 1250 250 0.063200 0.001334
410 1.5 2000 750 0.061500 0.004920
411 1.25 2500 500 0.061200 0.002734
412 1.1 2000 250 0.060500 0.001012
413 3 50 100 0.060000 0.000750
414 3 50 50 0.060000 0.000363
415 2 50 100 0.060000 0.000750
416 3 500 1000 0.060000 0.007523
417 2 50 50 0.060000 0.000363
418 2 2000 1500 0.060000 0.010828
419 2 250 250 0.060000 0.001819
420 1.1 750 100 0.058667 0.000435
421 2 500 500 0.058000 0.003653
422 3 750 1250 0.057333 0.009379
423 3 1500 2500 0.056667 0.018876
424 2 1250 1000 0.056000 0.007255
425 1.25 500 100 0.056000 0.000558
426 3 1250 2000 0.055200 0.015063
427 1.5 1500 500 0.052667 0.003291
428 1.25 1500 250 0.052667 0.001337
429 3 1000 1500 0.052000 0.011273
430 3 250 500 0.052000 0.003770
431 1.15 2000 250 0.050000 0.001177
432 2 1000 750 0.050000 0.005450
433 2 2000 1250 0.049500 0.009031
434 1.15 750 100 0.049333 0.000490
435 1.5 2500 750 0.049200 0.004939
436 1.1 2500 250 0.048400 0.001016
437 2 2500 1500 0.048000 0.010871
438 3 500 750 0.048000 0.005630
439 1.5 750 250 0.046667 0.001671
440 2 1500 1000 0.046667 0.007269
441 3 1500 2000 0.046000 0.015093
442 1.5 250 100 0.044000 0.000689
443 1.1 1000 100 0.044000 0.000436
444 3 1000 1250 0.043000 0.009397
445 3 2000 2500 0.042500 0.018950
446 3 1250 1500 0.041600 0.011295
447 1.15 2500 250 0.040000 0.001182
448 1.15 500 50 0.040000 0.000233
449 3 750 1000 0.040000 0.007537
450 2 1250 750 0.040000 0.005461
451 1.1 500 50 0.040000 0.000233
452 2 2500 1250 0.039600 0.009067
453 1.5 2000 500 0.039500 0.003303
454 1.25 2000 250 0.039500 0.001342
455 2 750 500 0.038667 0.003660
456 1.25 750 100 0.037333 0.000559
457 1.15 1000 100 0.037000 0.000490
458 1.1 1250 100 0.035200 0.000437
459 2 2000 1000 0.035000 0.007297
460 1.5 1000 250 0.035000 0.001674
461 3 1500 1500 0.034667 0.011318
462 3 2000 2000 0.034500 0.015152
463 3 1250 1250 0.034400 0.009415
464 3 2500 2500 0.034000 0.019024
465 1.25 500 50 0.034000 0.000256
466 2 1500 750 0.033333 0.005471
467 3 750 750 0.032000 0.005641
468 1.25 2500 250 0.031600 0.001347
469 1.5 2500 500 0.031600 0.003316
470 2 100 100 0.030000 0.000750
471 2 100 50 0.030000 0.000363
472 3 100 50 0.030000 0.000363
473 3 100 100 0.030000 0.000750
474 2 500 250 0.030000 0.001823
475 3 1000 1000 0.030000 0.007552
476 1.15 1250 100 0.029600 0.000491
477 1.1 1500 100 0.029333 0.000438
478 2 1000 500 0.029000 0.003667
479 3 1500 1250 0.028667 0.009434
480 1.5 250 50 0.028000 0.000333
481 2 2500 1000 0.028000 0.007326
482 3 250 250 0.028000 0.001881
483 1.5 1250 250 0.028000 0.001677
484 1.25 1000 100 0.028000 0.000561
485 3 2500 2000 0.027600 0.015212
486 1.15 750 50 0.026667 0.000233
487 1.1 750 50 0.026667 0.000233
488 3 500 500 0.026000 0.003777
489 3 2000 1500 0.026000 0.011362
490 2 2000 750 0.025000 0.005493
491 1.15 1500 100 0.024667 0.000492
492 3 1000 750 0.024000 0.005652
493 3 1250 1000 0.024000 0.007567
494 1.5 1500 250 0.023333 0.001680
495 2 1250 500 0.023200 0.003674
496 1.25 750 50 0.022667 0.000256
497 1.25 1250 100 0.022400 0.000562
498 1.1 2000 100 0.022000 0.000439
499 1.5 500 100 0.022000 0.000690
500 3 2000 1250 0.021500 0.009471
501 3 2500 1500 0.020800 0.011407
502 2 2500 750 0.020000 0.005514
503 1.15 1000 50 0.020000 0.000234
504 3 1500 1000 0.020000 0.007582
505 1.1 1000 50 0.020000 0.000234
506 2 750 250 0.020000 0.001826
507 2 1500 500 0.019333 0.003681
508 3 1250 750 0.019200 0.005663
509 1.25 1500 100 0.018667 0.000563
510 1.15 2000 100 0.018500 0.000494
511 1.1 2500 100 0.017600 0.000441
512 1.5 2000 250 0.017500 0.001687
513 3 750 500 0.017333 0.003784
514 3 2500 1250 0.017200 0.009508
515 1.25 1000 50 0.017000 0.000257
516 3 1500 750 0.016000 0.005674
517 1.15 1250 50 0.016000 0.000234
518 1.1 1250 50 0.016000 0.000234
519 2 1000 250 0.015000 0.001830
520 3 2000 1000 0.015000 0.007611
521 1.15 2500 100 0.014800 0.000496
522 1.5 750 100 0.014667 0.000692
523 2 2000 500 0.014500 0.003696
524 1.5 500 50 0.014000 0.000333
525 1.25 2000 100 0.014000 0.000565
526 3 500 250 0.014000 0.001885
527 1.5 2500 250 0.014000 0.001694
528 1.25 1250 50 0.013600 0.000257
529 1.15 1500 50 0.013333 0.000234
530 1.1 1500 50 0.013333 0.000234
531 3 1000 500 0.013000 0.003792
532 2 250 50 0.012000 0.000364
533 3 2000 750 0.012000 0.005697
534 3 250 50 0.012000 0.000364
535 3 250 100 0.012000 0.000751
536 2 250 100 0.012000 0.000751
537 2 1250 250 0.012000 0.001833
538 3 2500 1000 0.012000 0.007641
539 2 2500 500 0.011600 0.003710
540 1.25 1500 50 0.011333 0.000258
541 1.25 2500 100 0.011200 0.000567
542 1.5 1000 100 0.011000 0.000693
543 3 1250 500 0.010400 0.003799
544 2 1500 250 0.010000 0.001837
545 1.15 2000 50 0.010000 0.000235
546 1.1 2000 50 0.010000 0.000235
547 3 2500 750 0.009600 0.005719
548 3 750 250 0.009333 0.001888
549 1.5 750 50 0.009333 0.000334
550 1.5 1250 100 0.008800 0.000694
551 3 1500 500 0.008667 0.003806
552 1.25 2000 50 0.008500 0.000259
553 1.1 2500 50 0.008000 0.000236
554 1.15 2500 50 0.008000 0.000236
555 2 2000 250 0.007500 0.001844
556 1.5 1500 100 0.007333 0.000696
557 1.5 1000 50 0.007000 0.000335
558 3 1000 250 0.007000 0.001892
559 1.25 2500 50 0.006800 0.000260
560 3 2000 500 0.006500 0.003821
561 2 500 100 0.006000 0.000752
562 2 500 50 0.006000 0.000365
563 3 500 50 0.006000 0.000365
564 3 500 100 0.006000 0.000752
565 2 2500 250 0.006000 0.001851
566 3 1250 250 0.005600 0.001896
567 1.5 1250 50 0.005600 0.000335
568 1.5 2000 100 0.005500 0.000698
569 3 2500 500 0.005200 0.003836
570 3 1500 250 0.004667 0.001899
571 1.5 1500 50 0.004667 0.000336
572 1.5 2500 100 0.004400 0.000701
573 2 750 100 0.004000 0.000754
574 3 750 50 0.004000 0.000365
575 2 750 50 0.004000 0.000365
576 3 750 100 0.004000 0.000754
577 1.5 2000 50 0.003500 0.000337
578 3 2000 250 0.003500 0.001907
579 3 1000 100 0.003000 0.000755
580 3 1000 50 0.003000 0.000366
581 2 1000 50 0.003000 0.000366
582 2 1000 100 0.003000 0.000755
583 1.5 2500 50 0.002800 0.000339
584 3 2500 250 0.002800 0.001914
585 2 1250 100 0.002400 0.000757
586 3 1250 50 0.002400 0.000367
587 3 1250 100 0.002400 0.000757
588 2 1250 50 0.002400 0.000367
589 3 1500 100 0.002000 0.000758
590 2 1500 50 0.002000 0.000367
591 3 1500 50 0.002000 0.000367
592 2 1500 100 0.002000 0.000758
593 3 2000 100 0.001500 0.000761
594 3 2000 50 0.001500 0.000369
595 2 2000 100 0.001500 0.000761
596 2 2000 50 0.001500 0.000369
597 3 2500 50 0.001200 0.000370
598 3 2500 100 0.001200 0.000764
599 2 2500 100 0.001200 0.000764
600 2 2500 50 0.001200 0.000370

Regression analysis with shorter time period

Table OB.2

# term coef_3m term coef_6m term coef_12m
1 musk 0.018329 musk 0.006199 2020 0.002833
2 unsworth 0.014077 polit 0.006183 pro 0.002459
3 5g 0.010579 2020 0.005310 politics 0.001707
4 aws 0.009798 unsworth 0.004340 iphone 11 0.001518
5 huawei 0.009609 election 0.003460 libra 0.001507
6 amazon 0.009136 pro 0.003374 11 0.001384
7 2020 0.009077 aws 0.002569 climate 0.001318
8 2019 0.008944 ring 0.002399 unsworth 0.001208
9 ai 0.008309 breton 0.002101 tiktok 0.001113
10 ring 0.008101 climate 0.001866 johnson 0.001073
11 trade 0.007341 airpod 0.001817 ring 0.001029
12 cave 0.005278 political ad 0.001673 leyen 0.001004
13 johnson 0.004499 nato 0.001642 von 0.000995
14 youtube 0.004151 skydio 0.001612 der 0.000984
15 worker 0.004129 cave 0.001597 musk 0.000902
16 extinct 0.003978 black friday 0.001562 pixel 4 0.000842
17 climate 0.003925 extinct 0.001470 breton 0.000820
18 emissions 0.003893 politician 0.001364 tip 0.000803
19 2018 0.003653 boe 0.001305 pixel 0.000785
20 rescue 0.003589 democracy 0.001250 wework 0.000769
21 pedo 0.003552 tiktok 0.001238 aws 0.000751
22 insult 0.003434 pixel 0.001233 facial 0.000733
23 y2k 0.003401 2019 0.001225 earbud 0.000728
24 violate 0.003253 airpod pro 0.001221 sponsor 0.000690
25 pedo guy 0.003223 bias 0.001201 kong 0.000684
26 boe 0.003197 conservative 0.001146 von der 0.000671
27 brin 0.003037 pedo 0.001140 hong kong 0.000667
28 starlink 0.003007 vape 0.001127 vape 0.000667
29 jassi 0.002846 false 0.001124 hong 0.000667
30 tariff 0.002712 cybertruck 0.001111 conservative 0.000654

Weighting methods comparison: main method vs 531 method

Table OB.3

Terms among top 100 trending in main method, not top 100 trending in 531 method | Terms among top 100 trending in 531 method, not top 100 trending in main method
alleged | 13
cybersecurity | amid
democracy | democrat
fake | der
orbit | president donald
washington | supply

Weighting methods comparison: main vs equal source weight

Table OB.4

This research rests on several implicit assumptions, for example that all words in an article are equally important and that the assigned weights correctly represent the importance of each source. We recalculated all results while changing one assumption at a time, and the final results do not change significantly under either alternative. Of the top 100 fastest-growing terms in the base methodology, 95 and 86 appear among the top 250 fastest-growing terms after the first and second change, respectively. When the comparison is restricted to the top 100 fastest-growing terms, the values are 94 and 79. In the first change (the 531 method), the title and the first paragraph receive more weight in the coefficient calculations (title weight: 5, first paragraph weight: 3), on the assumption that the more salient and important terms appear at the beginning of an article. This can be misleading if the title and the first paragraph are "clickbait": there is a risk of capturing divisive and eye-catching words that are not emerging technologies. Some of the changes may be desirable. For example, "alleged" disappears from the list of growing terms, most likely because vague terms are not used to catch attention. However, words like "cybersecurity" and "fake" disappear too, even though they may warrant further investigation. The terms that enter the list when the title and the first paragraph are weighted more heavily seem random or irrelevant, which is why we consider this method inferior.
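The position weighting and the overlap counts described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, tokenization is reduced to whitespace splitting, and a weight of 1 for the remaining body text is an assumption (only the title and first-paragraph weights are stated).

```python
from collections import Counter

TITLE_W = 5      # title weight, as stated in the text
FIRST_PAR_W = 3  # first-paragraph weight, as stated in the text
BODY_W = 1       # assumption: weight of all later paragraphs


def weighted_counts(title, paragraphs):
    """Count lower-cased tokens, weighting each occurrence by its position."""
    counts = Counter()
    for tok in title.lower().split():
        counts[tok] += TITLE_W
    for i, par in enumerate(paragraphs):
        weight = FIRST_PAR_W if i == 0 else BODY_W
        for tok in par.lower().split():
            counts[tok] += weight
    return counts


def only_in_first(ranking_a, ranking_b, top_a=100, top_b=100):
    """Terms in the top `top_a` of ranking_a that are missing from the top `top_b` of ranking_b."""
    return sorted(set(ranking_a[:top_a]) - set(ranking_b[:top_b]))


counts = weighted_counts("5G ban", ["Huawei 5G ban", "trade talks on 5G"])
print(counts["5g"], counts["ban"], counts["huawei"])  # 9 8 3
```

Calling `only_in_first(main_ranking, alt_ranking, 100, 250)` on two ranked term lists would give the kind of difference list reported in the comparison tables; `main_ranking` and `alt_ranking` are placeholder names for lists sorted by trend coefficient.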

Terms among top 100 trending in main method, not top 250 trending in equal source weight | Terms among top 100 trending in equal source weight, not top 250 trending in main method
alleged | 13
election | gizmodo
euractiv | policy
eus | subscript
facebook | the verge
fake | u.
johnson | verge
leyen |
macron |
orbit |
practice |
satellite |
von |
washington |

Terms sorted by average annual growth rate among terms occurring in every year

Table OB.5

Here the source weights are removed, so that all articles are equally important. Although Fastcompany and ZDNet are similarly popular according to their Alexa.com rank, ZDNet then has much more influence on the final results, as it has over four times as many articles as Fastcompany. Trending keywords change more than in the first robustness check (for clarity, we present here only terms that fall outside the top 250 in the alternative method). The additional keywords do not seem relevant, while many terms related to politics are lost. As the goal of this research is to discover trends across a wide variety of sources and to assess the importance of emerging technologies for social issues, we have chosen the methodology of weighting by source.
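The effect of removing source weights can be illustrated with a small sketch. How the weights enter the actual calculation is not detailed here, so the sketch assumes one plausible reading: a term's share is computed per source and then averaged using the source weights, whereas without weights all articles are pooled. The article counts and weights come from Table OA.1; the mention counts and function names are hypothetical.

```python
def pooled_share(mentions, articles):
    """Share of articles mentioning a term when all sources are pooled.

    Large sources dominate, because every article counts equally.
    """
    return sum(mentions.values()) / sum(articles.values())


def source_weighted_share(mentions, articles, weights):
    """Weighted average of per-source shares (assumed weighting scheme)."""
    total_weight = sum(weights[s] for s in articles)
    return sum(weights[s] * mentions[s] / articles[s] for s in articles) / total_weight


articles = {"ZDNet": 33904, "Fastcompany": 7381}  # article counts, Table OA.1
weights = {"ZDNet": 0.09, "Fastcompany": 0.05}    # source weights, Table OA.1
mentions = {"ZDNet": 3390, "Fastcompany": 74}     # hypothetical: articles mentioning a term

pooled = pooled_share(mentions, articles)
weighted = source_weighted_share(mentions, articles, weights)
print(round(pooled, 4), round(weighted, 4))
```

In this made-up example the term is common in ZDNet and rare in Fastcompany, so the pooled share sits closer to the ZDNet share than the source-weighted share does, which is the imbalance the weighting is meant to limit.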

term coef coef_norm agr2019 agr2018 agr2017 average_agr
iphone x 3.773479e-05 0.012196 -0.842303 -0.131120 2883.292585 960.773054
hutchin 3.984871e-06 0.007752 2.741714 -0.898693 1344.429746 448.757589
pixel 3 8.026218e-05 0.064740 -0.325786 1131.843046 -0.821212 376.898683
face id 2.487465e-05 0.023622 -0.382949 -0.386081 986.816231 328.682400
mariya 1.988204e-05 0.030744 -0.028226 0.006485 782.867478 260.948579
arkit -2.754389e-07 -0.000503 -0.697503 -0.575862 658.202698 218.976444
equifax 1.719139e-05 0.008932 -0.425989 -0.636173 549.361820 182.766552
irma -1.845073e-06 -0.007996 -0.547843 -0.883616 486.595921 161.721487
iphone xs 9.930947e-05 0.053352 -0.611201 9.568392 450.684946 153.214046
khosrowshahi 2.579912e-05 0.023241 -0.325138 -0.251657 447.779250 149.067485
bixbi 5.271238e-06 0.004580 -0.218004 -0.555780 426.065306 141.763840
p20 1.608414e-05 0.028622 -0.921307 424.734828 -0.593737 141.073261
charlottesville 7.811297e-07 0.001720 -0.123647 -0.802816 418.057618 139.043718
series 4 1.899206e-05 0.059698 -0.623730 406.851055 -0.482550 135.248258
upguard 4.331406e-06 0.021778 0.135568 -0.428277 384.084462 127.930584
pixel 2 5.326657e-06 0.003965 -0.843873 -0.576335 370.277936 122.952576
s8s -3.960209e-06 -0.033017 -0.897353 -0.881286 271.251734 89.824365
animoji 3.349359e-06 0.011175 -0.835748 -0.018336 260.114561 86.420159
unsworth 8.913366e-05 0.097341 1.609779 245.965378 -0.518268 82.352296
barnier -1.540615e-05 -0.021262 -0.493581 -0.910477 248.095306 82.230416

Terms sorted by coefficient among terms with coef_norm >0.0125

Table OB.6

As a further robustness check, we computed the growth rates between years. Unfortunately, this method is not well suited to technological terms and social movements, as such terms often do not occur at all in one year and then explode in popularity in the next. Whatever their precise frequencies, any two such terms have the same growth rate (infinity). Moreover, the method does not seem to be robust to changes in the starting month. Table OB.5 shows the terms with the highest average annual growth rate. Annual growth rates could still be useful for comparing the growth of several terms that are trending by linear coefficient, and for identifying the period in which a term was trending most: is it a robust technology with a slowing growth rate (AI, blockchain), or an issue that grows in importance year after year (climate)? We show the results in the table below, but the charts in the main part of the article are sufficient for our purposes.
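The growth-rate calculation and its weakness can be sketched as follows. The definition of the annual growth rate as the relative change between consecutive years is an assumption (it is consistent with the reported values, none of which fall below -1), and the function names are hypothetical. The average is the plain mean of the three annual rates, which reproduces the average_agr column.

```python
import math


def annual_growth_rates(yearly_freq):
    """Relative change between consecutive years: f[t] / f[t-1] - 1.

    A term absent in the previous year gets an infinite rate, whatever
    its later frequency (or NaN if it is absent in both years).
    """
    rates = []
    for prev, cur in zip(yearly_freq, yearly_freq[1:]):
        if prev == 0:
            rates.append(math.inf if cur > 0 else math.nan)
        else:
            rates.append(cur / prev - 1)
    return rates


def average_agr(rates):
    """Plain mean of the annual growth rates."""
    return sum(rates) / len(rates)


# A barely visible term and a hugely popular one are indistinguishable
# if both were absent in the previous year.
print(annual_growth_rates([0, 1]), annual_growth_rates([0, 1000]))  # [inf] [inf]

# Rates reported for "huawei" (agr2019, agr2018, agr2017).
print(round(average_agr([2.158864, 1.815315, 0.069576]), 6))  # 1.347918
```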

term coef coef_norm agr2019 agr2018 agr2017 average_agr
huawei 0.001498 0.057430 2.158864 1.815315 0.069576 1.347918
facebook 0.001464 0.012844 -0.186142 0.755821 0.027630 0.199103
5g 0.001329 0.048283 0.977687 1.200978 0.741707 0.973457
2019 0.001160 0.068162 2.011041 2.261627 1.725850 1.999506
2018 0.001097 0.040546 -0.239154 2.747621 1.865743 1.458070
ai 0.000921 0.022269 -0.048961 0.432463 0.909434 0.430979
amazon 0.000766 0.013325 -0.023231 0.212401 0.379094 0.189421
china 0.000695 0.020133 0.337478 0.512756 -0.022784 0.275817
chinese 0.000648 0.029222 0.537753 0.854267 -0.047055 0.448322
politics 0.000516 0.017514 0.158935 0.211086 0.360713 0.243578
pro 0.000479 0.026197 0.492058 0.327931 0.159542 0.326510
2020 0.000363 0.032805 1.558757 0.122225 -0.070290 0.536897
sponsor 0.000354 0.043592 1.261489 1.454900 -0.061962 0.884809
climate 0.000345 0.027572 0.461446 0.417785 0.176166 0.351799
ban 0.000340 0.020179 0.251018 0.201336 0.556141 0.336165
trade 0.000338 0.013338 0.199458 0.033552 0.427002 0.220004
practice 0.000312 0.012746 0.361783 0.133978 0.037501 0.177754
blockchain 0.000307 0.019211 -0.314881 0.448025 1.671960 0.601701
elect 0.000307 0.013564 -0.097395 0.186902 0.493119 0.194209
facial 0.000301 0.040800 1.134429 0.202018 0.964245 0.766897

Highest scaled exponential regression coefficients with scaling parameter 1/2

Table OB.7

Exponential regressions may better suit the dynamics of term frequencies. However, the results show that the baseline method is better at identifying relevant trending terms. Because the logarithm is the inverse of the exponential function, running an OLS regression on the logarithms of the frequencies is equivalent to fitting an exponential trend. Zero values pose a problem, as the logarithm of 0 is undefined, so they are discarded. In the regression, the explanatory variable is the month number (between 1 and 48, inclusive), and the response variable is the logarithm of the term frequency in the corresponding month. The regression yields two parameters: a constant and β. The β parameter is highest for terms with a very high frequency in the last analysed month, a low frequency in the second-to-last month, and zero in all previous months. Consequently, normalization is required: we multiply the β parameter by the mean frequency raised to the power of a scaling parameter:
$$\text{exp\_coef\_norm} = \beta \cdot \text{mean\_frequency}^{\,\text{scaling\_parameter}}$$
If the scaling parameter is 1, the top results are similar to those from OLS and only the order changes: 2018, 5g, huawei, facebook, and 2019 are the five most trending terms (5th, 3rd, 1st, 2nd and 4th highest OLS coefficients, respectively). The lower the scaling parameter, the more specific the results become. With a value of 2/3, pixel 4 enters the top 5, and with a scaling parameter of 1/2 it is in the top 3. Some of the most trending terms from the linear regression still appear in the exponential regression results with a scaling parameter of 1/2.
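The scaled exponential coefficient can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation: the function name exp_coef_norm_for_series, the choice to average the frequency over all months (including zero months), and the two toy series are hypothetical.

```python
import math

def exp_coef_norm_for_series(monthly_freq, scaling_parameter):
    """Slope of an OLS fit of log(frequency) on month number, scaled by
    mean frequency ** scaling_parameter.

    Months are numbered from 1. Months with zero frequency are discarded,
    because log(0) is undefined. The mean frequency is taken over all
    months (an assumption made for this sketch).
    """
    points = [(m, math.log(f))
              for m, f in enumerate(monthly_freq, start=1) if f > 0]
    if len(points) < 2:
        return math.nan  # a slope needs at least two non-zero months
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    beta = sxy / sxx
    mean_frequency = sum(monthly_freq) / len(monthly_freq)
    return beta * mean_frequency ** scaling_parameter

# Hypothetical 48-month series: a frequent term growing 2% per month, and a
# term that is absent for 46 months and then jumps in the last two months.
established = [0.001 * 1.02 ** m for m in range(48)]
late_spike = [0.0] * 46 + [1e-6, 1e-4]

for s in (1, 2 / 3, 1 / 2, 1 / 3):
    print(s,
          exp_coef_norm_for_series(established, s),
          exp_coef_norm_for_series(late_spike, s))
```

With a scaling parameter of 1 the frequent, steadily growing series ranks above the late spike. With 1/2 or 1/3 the order flips, which mirrors how lower scaling parameters bring very recent, rare terms to the top of the tables.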

term exp_coef_norm
politico politico 0.017485
break scoop 0.017485
pixel 4 0.014922
unsworth attorney 0.013779
2019 0.012156
7t 0.011670
2018 0.010713
huawei 0.010284
5g 0.010110
iphon 11 0.009777
danuvius 0.008991
intensive hunt 0.008803
tiktok 0.008367
oneplus 7t 0.008361
nuflare 0.008087

Highest scaled exponential regression coefficients with scaling parameter 1/3

Table OB.8

term exp_coef_norm
politico politico 0.095853
break scoop 0.095853
unsworth attorney 0.085915
intensive hunt 0.057180
nuflar 0.055977
pixel 4 0.049320
danuvius 0.048302
virgil griffith 0.046760
7t 0.044681
remrise 0.041782
chrome 79 0.041114
unified dream 0.041105
jim chilton 0.036881
oneplus 7t 0.034870
investiture 0.034650

Calculated co-occurrences for blockchain technology

Table OC.1

# blockchain technology blockchain technology_cooc
1 cryptocurrency 196.845597
2 blockchain technology 100.000000
3 crypto 46.768246
4 digital currency 40.098701
5 libra 34.603120
6 china 27.762217
7 blockchainbased 22.699050
8 sec 22.516698
9 2018 22.063878
10 carlson 19.166667
11 decentralisation 18.965503
12 ai 16.335167
13 deepfake 15.004559
14 ban 14.826365
15 supply chain 11.437814
16 chinese 10.549503
17 consent 8.522266
18 sanction 8.238398
19 zaslavskiy 7.738095
20 2020 7.326500
21 stablecoin 7.221113
22 long island 6.178832
23 voatz 6.130320
24 puigneró 5.833333
25 cryptocurrency exchange 5.310962
26 ecb 5.259165
27 coinbase 5.233932
28 2019 5.196297
29 griffith 5.136423
30 democracy 5.104822

Calculated co-occurrences for surveillance capitalism

Table OC.2

# surveillance capitalism surveillance capitalism_cooc
1 surveillance capitalism 95.000000
2 sovereignty 70.801688
3 ai 54.022208
4 digital sovereignty 50.000000
5 gdpr 49.736234
6 tech giant 48.940000
7 zuboff 47.088442
8 euractiv 30.000000
9 yuval noah 28.175325
10 2018 27.960519
11 waterfront 26.464286
12 altmaier 25.000000
13 lawmaker 22.785195
14 waterfront toronto 22.642857
15 democraci 22.484935
16 china 20.940519
17 shoshana 20.136753
18 bria 20.000000
19 ja 19.285714
20 shoshana zuboff 19.130519
21 sponsor 16.960519
22 der 15.444545
23 5g 15.295455
24 scandal 15.265455
25 leyen 15.204545
26 von 15.204545
27 soli 15.000000
28 edpb 15.000000
29 analytica 14.880260
30 cambridge 14.864026

Calculated co-occurrences for misinformation

Table OC.3

# misinformation misinformation_cooc
1 misinformation 100.000000
2 ai 42.940641
3 vaccine 40.227939
4 deepfake 35.221418
5 ban 26.475820
6 factcheck 25.567188
7 disinformation 24.569195
8 2018 24.355350
9 cambridge 16.706713
10 conspiracy 16.041673
11 democracy 15.134445
12 analytica 14.750945
13 china 14.002938
14 cambridge analytica 13.665370
15 climate 13.635736
16 measles 13.202411
17 5g 13.186447
18 quantum computing 12.965406
19 political ad 12.872236
20 2019 12.527644
21 lawmaker 12.243960
22 chinese 10.539452
23 bias 9.874223
24 scandal 9.357493
25 conspiracytheory 8.175796
26 2020 7.970153
27 antivaccine 7.965322
28 protest 7.886655
29 spread misinformation 7.770613
30 fake account 7.389236

Calculated co-occurrences for chinese telecom

Table OC.4

# chinese telecom chinese telecom_cooc
1 huawei 800.207708
2 chinese 402.108071
3 china 271.979774
4 5g 248.827971
5 ban 164.811744
6 zte 140.522809
7 chinese telecom 95.000000
8 meng 69.127429
9 5g network 61.133997
10 sanction 53.617221
11 chinese government 50.771921
12 ren 40.319427
13 ally 35.766376
14 trump administration 32.293809
15 lawmaker 31.379291
16 espionage 31.310811
17 iran 30.910163
18 beijing 30.430263
19 2018 28.266280
20 tariff 27.258771
21 wanzhou 23.545780
22 communist 20.524686
23 extradition 20.103992
24 huawei equipment 19.867654
25 zhengfei 19.413280
26 2019 18.858124
27 us sanction 18.168436
28 commercial departure 17.792035
29 trade war 16.965370
30 sen 16.851800

Calculated co-occurrences for gdpr

Table OC.5

# gdpr gdpr_cooc
1 gdpr 100.000000
2 consent 33.370909
3 ai 33.169374
4 2018 20.249770
5 cambridge 17.118728
6 analytica 16.116974
7 recognition 15.262690
8 protection regulation 14.821126
9 cambridge analytica 14.673502
10 newsom 13.081628
11 scandal 10.513022
12 china 10.289559
13 facial 9.334615
14 ethic 9.005351
15 ftc 8.526061
16 facial recognition 8.376526
17 lawmaker 8.073461
18 2019 6.375928
19 tech giant 5.755937
20 chinese 5.696861
21 ban 5.400281
22 nacho 5.270270
23 2020 4.605429
24 misuse 4.356815
25 huawei 4.294887
26 information commission 3.993150
27 democracy 3.747560
28 privacy legislation 3.576361
29 regulation gdpr 3.547730
30 whois 3.434968

Calculated co-occurrences for facebook

Table OC.6

# facebook facebook_cooc
1 ai 13.407766
2 cambridg 5.434461
3 2018 5.193653
4 analytica 5.051801
5 cambridg_analytica 4.709706
6 china 4.272199
7 ban 3.972702
8 gdpr 3.110692
9 scandal 2.743519
10 recognit 2.531495
11 chines 2.508965
12 tech_giant 2.462065
13 2019 2.457149
14 disinform 2.371973
15 consent 2.326978
16 lawmak 2.225382
17 cryptocurr 2.007102
18 libra 1.930945
19 misinform 1.896684
20 democraci 1.812156
21 5g 1.798370
22 ethic 1.791328
23 bias 1.698170
24 pro 1.688553
25 2020 1.602577
26 sponsor 1.588879
27 huawei 1.574024
28 protest 1.553039
29 facial 1.543015
30 tip 1.539259

Calculated co-occurrences for facial recognition

Table OC.7

# facial_recognit facial_recognit_cooc
1 recognit 123.622496
2 facial 115.143097
3 facial_recognit 100.000000
4 ai 64.809141
5 china 22.968115
6 bias 14.448999
7 chines 13.927527
8 ethic 13.403555
9 2018 12.114826
10 pro 11.359893
11 huawei 11.289938
12 face_recognit 11.242109
13 ban 10.890462
14 face_id 8.955656
15 consent 8.566800
16 rekognit 7.928256
17 ring 7.343297
18 5g 6.576523
19 2019 6.539149
20 gdpr 6.422862
21 euractiv 6.117647
22 xs 4.925280
23 protest 4.696483
24 lawmak 4.611668
25 biometr_data 4.439124
26 2020 4.437249
27 oneplus 4.304753
28 afr 4.295123
29 max 3.944335
30 doorbel 3.834518

Most positive co-occurrences for gdpr

Table OD.1

term gdpr gdpr count
365 0.210765356
migration 0.1965342762
public_cloud 0.1919742851
ccpa 0.189392708
fcc 0.177942341
voice_assistant 0.175058832
ai 0.17439111779
california_consumer 0.1736491279
climate 0.1696721214
face_recognition 0.1690011490
newsome 0.165271105
87 0.1623332
5g 0.1588672787
surveillance_capitalism 0.1567811189
mozilla 0.1566811343
risk_assessment 0.156673919
recognition 0.1540681851
ring 0.1537681545
gavin 0.153422262
data_portability 0.1529151384

Most negative co-occurrences for gdpr

Table OD.2

term gdpr gdpr count
ftc 0.08646612598
federal_trade 0.08608023142
wyden 0.0854798321
sanctions 0.08195983282
analytica 0.0813916885
misuse 0.07932126936
do_not 0.0782476601
amid 0.0771283132
chu 0.0765727110
ban 0.07520455908
irish_data 0.07287152716
disinformation 0.06189023140
87_million 0.06019232615
protest 0.05971812434
dixon 0.0592005789
tiktok 0.0454051897
noyb 0.04346721002
dpc 0.04072271346
marriott 0.0342517511
whois -0.03467051040

Most positive co-occurrences for huawei

Table OD.3

term huawei huawei count
telephoto 0.2814491765
ondevice_ai 0.271619761
pixel_4 0.2598931078
6gb 0.2587974826
pixel_3a 0.2587191561
oneplus_7 0.2582352258
wireless_charging 0.2529318957
note_9 0.2511343118
earbuds 0.2507871308
zoom 0.2477447040
mate_20 0.2468072702
xs 0.2442851449
oneplus 0.2417798495
8gb 0.2415415215
128gb 0.2408857173
google_assistant 0.2408334735
pixel_3 0.2408034837
3a 0.240416389
notch 0.2396748694
galaxy_s10 0.2361463744

Most negative co-occurrences for huawei

Table OD.4

term huawei huawei count
ban_huawei 0.02701045265
telecom_equipment 0.02310646592
president_donald 0.02216119756
trade_war 0.02127448214
lawmakers 0.01962565608
huawei_equipment 0.015108510210
chinese_government 0.011887713765
chinese_telecom 0.008563498393
trump_administration 0.008539899358
ren 0.00624312007
espionage 0.002117096687
blacklist -0.0002019924719
hong -0.0226732436
zhengfei -0.02870794131
commerce_department -0.03073224774
sanctions -0.03289316135
iran -0.05412977237
extradite -0.07613683318
meng -0.07824584391
wanzhou -0.08031163305

Most positive co-occurrences for facebook

Table OD.5

term facebook facebook count
macos 0.1890018879
tencent 0.1851388989
softbank 0.1841935852
read_also 0.1820766661
podcast 0.1756919697
ai 0.1751283437
kong 0.174563939
pro 0.17265519620
next_generation 0.17097618418
recognition 0.16882714035
365 0.168662164
sponsor 0.16642118942
migration 0.16507614354
github 0.1607029798
max 0.1584789075
5g 0.15588611014
bezos 0.1522916784
crypto 0.151769861
87 0.1455711654
facial 0.1439035252

Most negative co-occurrences for facebook

Table OD.6

term facebook facebook count
misinformation 0.042273736951
fake_account 0.041880914934
vaccine 0.04157846091
factcheck 0.041048915639
iran 0.04083738832
protest 0.038787333626
kogan 0.03558154776
deepfake 0.03468935066
spokesperson_told 0.034292915725
president_donald 0.034003220493
disinformation 0.032379723635
ban 0.030315872519
scl 0.0275033089
president_trump 0.024727216907
rightwing 0.010440710223
content_moderation 0.0046752115109
conspiracy -0.0034507612040
myanmar -0.01581488267
farright -0.021693810140
infowars -0.0505047079

Most positive co-occurrences for facial recognition

Table OD.7

term facial_recognit facial_recognit_count
aibo 0.388224 95.0
byton 0.297190 145.0
dex 0.285977 476.0
doorbel 0.191735 1191.0
galaxi_s9 0.243350 808.0
googl_assist 0.246743 2537.0
iphon_xs 0.204627 2676.0
max 0.215127 1991.0
ml 0.199868 805.0
notch 0.222321 2026.0
oneplus 0.214217 813.0
pixel_4 0.217593 638.0
s9 0.226085 847.0
tencent 0.198929 880.0
thinkpad 0.261255 354.0
thunderbolt_3 0.289505 975.0
voic_assist 0.253189 1453.0
wireless_charg 0.227108 2196.0
xs 0.254027 787.0
zoom 0.245241 2140.0

Most negative co-occurrences for facial recognition

Table OD.8

term facial_recognit facial_recognit_count
afr -0.017059 771.0
analytica 0.033960 125.0
autom_facial 0.028824 1310.0
autonom_weapon 0.033220 590.0
blacklist 0.031548 679.0
cardiff 0.006073 999.0
cbp 0.027427 749.0
chines_govern 0.037335 2754.0
custom_enforc 0.033937 2231.0
democraci 0.045571 2511.0
ftc 0.026918 1163.0
hikvis 0.019075 269.0
killer_robot 0.037154 811.0
lawmak 0.024536 4340.0
megvii 0.039383 669.0
metropolitan_polic 0.008278 1521.0
racial_bias 0.022877 1506.0
south_wale 0.010954 1447.0
uighur -0.007029 1136.0
xinjiang 0.018977 1227.0

20 most similar words to misinformation by skip-gram and continuous bag-of-words methods

Table OE.1

# misinformation misinformation_similarity_skipgram misinformation misinformation_similarity_cbow
1 disinformation 0.883826 disinformation 0.828316
2 propaganda 0.857456 propaganda 0.725279
3 falsity 0.797461 falsehood 0.708662
4 antirohingya 0.792485 sensation 0.664328
5 fakenews 0.776940 falsity 0.628542
6 antivax 0.771283 vitriol 0.628100
7 falsehood 0.753293 hoax 0.616483
8 extremist 0.751691 islamophobia 0.589717
9 hyperpartisan 0.751594 untruth 0.586440
10 fake 0.748578 malaria 0.571724
11 sensation 0.745698 extremist 0.565146
12 democracyd 0.744012 incivil 0.562336
13 hoax 0.740827 antivax 0.559144
14 downrank 0.740315 sensationalist 0.557074
15 foreignbought 0.737055 clickbait 0.554998
16 foment 0.731576 opioid 0.554302
17 prokremlin 0.730695 harass 0.553518
18 sensationalist 0.730553 antirohingya 0.548091
19 spread 0.726001 cyberbully 0.547763
20 radicalise 0.725107 antivaxx 0.546821

20 most similar words to telecom by skip-gram and continuous bag-of-words methods

Table OE.2

# telecom telecom_similarity_skipgram telecom telecom_similarity_cbow
1 telecommunication 0.827904 telecommunication 0.733190
2 telco 0.823873 teleco 0.621462
3 teleco 0.745949 telco 0.598126
4 fastweb 0.740921 telia 0.563209
5 telekom 0.736889 paytv 0.559957
6 telecomm 0.732730 telecomm 0.555353
7 numericablesfr 0.719761 kpn 0.546445
8 telefonica 0.715725 statecontrol 0.542754
9 kpn 0.713414 autocomponent 0.532798
10 ericsson 0.712622 agrichem 0.531739
11 telefónica 0.710835 vistra 0.531067
12 tlit.mi 0.708989 chaebol 0.530784
13 telcom 0.708183 reli.n 0.526391
14 0728.hk 0.705815 tvs-to-nuclear 0.512156
15 hytera 0.698693 aerospace 0.508681
16 5g 0.694523 bdi 0.502585
17 equipment 0.692242 oran.pa 0.502112
18 zte 0.692005 telefonica 0.499620
19 rostelecom 0.689955 semiconductor 0.495913
20 anatel 0.687707 chinesebased 0.490151

20 most similar words to gdpr by skip-gram and continuous bag-of-words methods

Table OE.3

# gdpr gdpr_similarity_skipgram gdpr gdpr_similarity_cbow
1 gdprs 0.815808 gdprs 0.645225
2 dataprotection 0.788226 ccpa 0.644862
3 ccpa 0.787765 dataprotection 0.552778
4 eprivacy 0.738384 coppa 0.549953
5 gdprlike 0.726634 legislation 0.549295
6 gdprstyle 0.711904 ir35 0.537033
7 gpdr 0.703906 euus 0.533341
8 dpa 0.703122 dpa 0.529481
9 eudatap 0.697647 ecj 0.525422
10 compliance 0.687931 privacyshield 0.525382
11 privacy 0.685799 gpdr 0.523836
12 pdpa 0.682532 cnil 0.513293
13 pregdpr 0.681566 eprivacy 0.502669
14 edpb 0.680758 dripa 0.501070
15 edp 0.677845 ecpa 0.494132
16 regulation 0.677562 euu. 0.491269
17 dpia 0.674435 law 0.485118
18 ferpa 0.670080 eu 0.479858
19 cjeu 0.669139 bipa 0.478978
20 eus 0.663556 tos 0.477664

20 most similar words to blockchain by skip-gram and continuous bag-of-words methods

Table OE.4

# blockchain blockchain_similarity_skipgram blockchain blockchain_similarity_cbow
1 dlt 0.841446 dlt 0.700861
2 blockchainbased 0.810058 ai 0.634107
3 ledger 0.793925 ledger 0.609701
4 corda 0.769543 blockchainbased 0.601099
5 nexledger 0.767227 corda 0.562073
6 dapp 0.755415 pokitdok 0.561195
7 r3s 0.745285 dokchain 0.553961
8 decentralisation 0.743441 eosio 0.547927
9 blockchain distributed 0.741421 dlts 0.547524
10 dlts 0.740053 quantumsafe 0.541063
11 eosio 0.734448 dapp 0.536015
12 blocko 0.729792 po.et 0.534564
13 exonum 0.727048 decentralisation 0.520578
14 decentralisation 0.722612 photomatch 0.519107
15 pokitdok 0.719603 exscientia 0.517620
16 ethereumbased 0.718544 interledger 0.517271
17 euroclear 0.714574 cryptocurrency 0.515100
18 distributed ledger 0.713350 regtech 0.515047
19 ethereum 0.713333 acumo 0.508242
20 bitfury 0.711121 lifi 0.506971

20 most similar words to surveillance by skip-gram and continuous bag-of-words methods

Table OE.5

# surveillance surveillance_similarity_skipgram surveillance surveillance_similarity_cbow
1 chinastyle 0.721142 data gathering 0.578075
2 spy 0.721105 crime fight 0.549618
3 cctv 0.706137 face recognition 0.548736
4 dragnet 0.698467 indiscriminately 0.545481
5 bodyworn 0.697809 military 0.543751
6 alpr 0.692501 oppress 0.525787
7 indiscriminately 0.685088 bodyworn 0.515463
8 facematch 0.684379 cctv 0.514979
9 warrantless 0.677355 cellphonetrack 0.513215
10 law enforcement 0.677026 adtarget 0.507493
11 cybersurveillance 0.668488 repress 0.506280
12 dirtbox 0.663059 mass surveillance 0.504072
13 facial recognition 0.662937 intelligence gathering 0.503398
14 mass surveillance 0.662037 illiberal 0.500629
15 stringray 0.657137 censorship 0.494181
16 snoop 0.655360 facial recognition 0.487126
17 intrusive 0.654282 antihack 0.486925
18 gatelike 0.651238 reconnaissance 0.486459
19 crime fight 0.650610 weaponry 0.480413
20 eyeinthesky 0.649750 biometric 0.477741

20 most similar words to capital(ism) by skip-gram and continuous bag-of-words methods

Table OE.6

# capital(ism) capital(ism)_similarity_skipgram capital(ism) capital(ism)_similarity_cbow
1 venture 0.856083 fund 0.681806
2 vc 0.824091 yl 0.599331
3 steadview 0.813430 valar 0.571306
4 fund 0.808720 redpoint 0.570727
5 ribbit 0.802281 felicis 0.566737
6 beco 0.794095 dcm 0.561829
7 sequoia 0.791000 ivp 0.561614
8 venture capital 0.790335 eqt 0.559333
9 gaorong 0.786376 kima 0.559007
10 kreo 0.783774 otium 0.551082
11 tenaya 0.781438 nextview 0.551043
12 riverwood 0.781102 airtree 0.544061
13 wrvi 0.780353 8vc 0.544004
14 meritech 0.775807 shasta 0.543560
15 accelerator 0.775800 bgf 0.542881
16 portag3 0.775667 creandum 0.539565
17 wamda 0.774710 kkr 0.535284
18 hillhouse 0.773195 bluerun 0.535048
19 hostplus 0.772896 hippeau 0.528162
20 paua 0.772849 kaszek 0.525930

LDA

Chart OF.1

Co-occurrence analysis

Chart OF.2

About NGI Forward

NGI Forward has received funding from the European Union's Horizon 2020 research and innovation programme under the Grant Agreement no 825652. The content of this website does not represent the opinion of the European Union, and the European Union is not responsible for any use that might be made of such content.
