Advances in Social Sciences Research Journal – Vol. 10, No. 2
Publication Date: February 25, 2023
DOI: 10.14738/assrj.102.13950. Kang, N. (2023). K-POP in BBC News: A Big Data Analysis. Advances in Social Sciences Research Journal, 10(2). 156-169.
Services for Science and Education – United Kingdom
K-POP in BBC News: A Big Data Analysis
Namkil Kang
Far East University, South Korea
ABSTRACT
The main goal of this paper is to analyze 40 pieces of BBC news about K-pop broadcast from
4 January 2020 to 21 December 2022. As a research tool for this goal, we used the software
package NetMiner. A point to note is that one word has the highest frequency (1,005 tokens)
and the highest proportion (0.507). A further point to note is that in the word cloud, the
noun BTS appears as the largest word. This in turn implies that it is the most frequent
keyword and thus counts as the most significant. With respect to the topics occurring in
the 40 pieces of BBC news, topic 1 was the most widely used, followed by topic 8, topic 7,
topic 2, and topic 9, in that order. Among the major words occurring in the 40 pieces of
BBC news, the word BTS was the most frequent, followed by the words fan, group, band, and
year (and member), in descending order. This paper also argues that the word BTS has the
highest in-degree centrality (0.454545), which indicates that the word BTS counts as the
most significant or popular. This paper further argues that the word BTS has the highest
in-closeness centrality (0.594310). More specifically, the distance between the word BTS
and the other nodes is the shortest, so that BTS counts as the most significant and
important.
Keywords: K-pop, NetMiner, topic, keyword, map, centrality, BBC
INTRODUCTION
The main purpose of this paper is to analyze 40 pieces of BBC news about K-pop broadcast from
4 January 2020 to 21 December 2022. As a research tool for this goal, we used the software
package NetMiner to analyze big data (40 pieces of BBC news). First, we provide information on
the frequency and proportion of all words occurring in the 40 pieces of BBC news. Second, we
provide a word cloud in which all keywords are represented in different sizes, depending on
their frequency. Third, we present the 12 topics that constitute the 40 pieces of BBC news and
the keywords that make up each topic. We also examine the frequency of each topic, which shows
which topics are preferred in the 40 pieces of BBC news, and we visualize which keywords are
linked to each topic. Fourth, we consider the frequency of the main nouns used in the 40
pieces of BBC news. Fifth, we look into degree centrality (a term used in NetMiner) and provide
its map. Degree centrality refers to the number of directly linked neighbors of a node. Sixth,
we inquire into closeness centrality (also a term used in NetMiner) and provide its map.
Closeness centrality reflects the distance between a node and the other nodes: the shorter the
distance between a node and the other nodes, the higher the value of closeness centrality. The
organization of this paper is as follows. In section 2.1, we argue that one word has the highest
frequency (1,005 tokens) and the highest proportion (0.507). In section 2.2, we further argue
that in the word cloud, the noun BTS appears as the largest word. This in turn indicates that it
is the most frequently used one and is thus regarded as the most significant. In section 2.3,
we maintain that topic 1 was the most frequently used one, followed by topic 8, topic 7, topic 2,
and topic 9, in that order. In section 2.4, we contend that the word BTS was the most frequent
one, followed by the words fan, group, band, and year (and member), in descending order. In
section 2.5, the in-degree centrality values of the main words are provided. Quite
interestingly, the value for the word BTS is the highest (0.454545). This in turn suggests that
the word BTS counts as the most significant or popular. In section 2.6, we argue that the word
BTS has the highest in-closeness centrality (0.594310). This in turn indicates that the distance
between the word BTS and the other nodes is the shortest, so that BTS counts as the most
significant and important.
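The two centrality measures described above can be illustrated with a minimal sketch. It is not NetMiner's implementation: the toy graph, the function names, and the normalization (in-degree divided by n - 1; in-closeness as the number of nodes that can reach a node divided by the sum of their distances to it) are assumptions made for illustration, and NetMiner may normalize differently.

```python
from collections import deque

# Hypothetical toy network of keywords; an edge (a, b) means "a links to b".
NODES = ["BTS", "fan", "group", "band"]
EDGES = [("fan", "BTS"), ("group", "BTS"), ("band", "BTS"),
         ("BTS", "fan"), ("fan", "group"), ("group", "band")]


def in_degree_centrality(edges, nodes):
    """Share of the other nodes that link directly to each node."""
    incoming = {v: 0 for v in nodes}
    for src, dst in set(edges):
        if src != dst:
            incoming[dst] += 1
    return {v: incoming[v] / (len(nodes) - 1) for v in nodes}


def in_closeness_centrality(edges, nodes):
    """Nodes that can reach v, divided by the sum of their distances to v."""
    # Reverse adjacency list: for each node, the nodes pointing at it.
    preds = {v: [] for v in nodes}
    for src, dst in set(edges):
        preds[dst].append(src)
    result = {}
    for v in nodes:
        # Breadth-first search along reversed edges gives distances *to* v.
        dist = {v: 0}
        queue = deque([v])
        while queue:
            cur = queue.popleft()
            for p in preds[cur]:
                if p not in dist:
                    dist[p] = dist[cur] + 1
                    queue.append(p)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


degree = in_degree_centrality(EDGES, NODES)
closeness = in_closeness_centrality(EDGES, NODES)
print(max(degree, key=degree.get), max(closeness, key=closeness.get))
```

In this toy graph every other node links directly to BTS, so BTS has both the largest number of incoming neighbors and the shortest distances from the other nodes. When every node can reach every other node, as here, the in-closeness value equals (n - 1) divided by the summed distances.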
INFORMATION ON THE FREQUENCY AND PROPORTION OF ALL NOUNS
In what follows, we provide information on the frequency and proportion of all words
occurring in the 40 pieces of BBC news:
Table 1 Information on the frequency and proportion of all nouns
Value Frequency Proportion Cumulative Proportion
1.0 1005 0.507 0.507
2.0 356 0.18 0.686
3.0 174 0.088 0.774
4.0 120 0.061 0.835
5.0 62 0.031 0.866
6.0 43 0.022 0.888
7.0 26 0.013 0.901
8.0 28 0.014 0.915
9.0 18 0.009 0.924
10.0 20 0.01 0.934
11.0 19 0.01 0.944
12.0 10 0.005 0.949
13.0 6 0.003 0.952
14.0 13 0.007 0.958
15.0 6 0.003 0.961
16.0 6 0.003 0.964
17.0 7 0.004 0.968
18.0 2 0.001 0.969
19.0 3 0.002 0.97
20.0 6 0.003 0.973
21.0 3 0.002 0.975
22.0 5 0.003 0.977
23.0 2 0.001 0.978
24.0 6 0.003 0.981
26.0 4 0.002 0.983
27.0 4 0.002 0.985
30.0 2 0.001 0.986
31.0 3 0.002 0.988
34.0 2 0.001 0.989
36.0 1 0.001 0.989
42.0 1 0.001 0.99
46.0 1 0.001 0.99
47.0 1 0.001 0.991
48.0 1 0.001 0.991
49.0 2 0.001 0.992
55.0 1 0.001 0.993
56.0 1 0.001 0.993
59.0 3 0.002 0.995
70.0 1 0.001 0.995
71.0 1 0.001 0.996
75.0 1 0.001 0.996
83.0 1 0.001 0.997
94.0 1 0.001 0.997
95.0 1 0.001 0.998
142.0 1 0.001 0.998
160.0 1 0.001 0.999
194.0 1 0.001 0.999
294.0 1 0.001 1
Total 1983 1
It is worth pointing out that one word occurs with a frequency of 1,005 tokens (the
highest). Note that its proportion and its cumulative proportion are the same, namely 0.507.
Rank two is also of interest: two words have a frequency of 356 tokens (the second highest).
More specifically, their proportion and cumulative proportion are 0.18 and 0.686, respectively.
It must be stressed, on the other hand, that three words also have a high frequency and
proportion. To be more specific, their frequency is 174 tokens and their proportion is 0.088.
It is also worth noting that four words have a frequency of 120 tokens, with a proportion of
0.061 and a cumulative proportion of 0.835. Finally, five words have a frequency of 62 tokens
and a proportion of 0.031. We thus conclude that one word has the
highest frequency (1,005 tokens) and the highest proportion (0.507).
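The Proportion and Cumulative Proportion columns of Table 1 appear to follow directly from the Frequency column and the Total row (1,983). The following sketch reproduces the first five rows. The variable names are illustrative, and rounding to three decimals is an assumption based on how the table is printed.

```python
from itertools import accumulate

TOTAL = 1983  # "Total" row of Table 1
frequency_column = [1005, 356, 174, 120, 62]  # first five rows of Table 1

# Proportion: each frequency divided by the total.
proportions = [round(f / TOTAL, 3) for f in frequency_column]

# Cumulative proportion: running sum of frequencies divided by the total.
cumulative = [round(c / TOTAL, 3) for c in accumulate(frequency_column)]

print(proportions)  # [0.507, 0.18, 0.088, 0.061, 0.031]
print(cumulative)   # [0.507, 0.686, 0.774, 0.835, 0.866]
```

Both lists match the values printed in Table 1 for these rows, which is why the first row has identical proportion and cumulative proportion.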
Word Cloud
In the following, we provide a word cloud representing the 40 pieces of BBC news, in which
key nouns appear in different sizes, depending on their frequency:
Figure 1 Word cloud
It is particularly noteworthy that the noun BTS appears as the largest word. This in turn shows
that it is the most frequent noun and thus counts as the most significant keyword. The noun
fan, on the other hand, appears as the second largest, and is therefore regarded as the second
most important keyword. The noun group appears as the third largest, which means that it is the
third most frequent and is thus seen as the third most important keyword. The noun band is the
fourth largest, implying that it is the fourth most significant keyword. Finally, the nouns
member and year appear at the same size, the fifth largest, and thus count as equal in
importance. We thus conclude that the noun BTS is the largest in size, which in turn indicates
that it is the most frequent noun in the 40 pieces of BBC news.
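How a word cloud turns frequency into size can be sketched as follows. This is only an illustration under stated assumptions: the counts are hypothetical placeholders (the actual counts behind Figure 1 are not given in this section), the linear scaling and the point sizes are arbitrary choices, and NetMiner's own scaling rule may differ.

```python
def font_sizes(counts, min_pt=10, max_pt=72):
    """Map word counts linearly onto font sizes between min_pt and max_pt."""
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, count in counts.items():
        # Relative position of this count between the lowest and highest.
        share = (count - lowest) / span if span else 1.0
        sizes[word] = round(min_pt + share * (max_pt - min_pt))
    return sizes


# Hypothetical counts, chosen only to mirror the ranking described above.
counts = {"BTS": 100, "fan": 60, "group": 50, "band": 40, "member": 30, "year": 30}
sizes = font_sizes(counts)
for word in sorted(sizes, key=sizes.get, reverse=True):
    print(word, sizes[word])
```

Words with equal counts, such as member and year in this example, receive the same size, which matches the observation that the two nouns appear equally large.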
Topics and Keywords
The goal of this section is to investigate the 12 topics that constitute the 40 pieces of BBC
news and the 5 keywords that make up each topic. This section also examines the frequency
of each topic in the 40 pieces of BBC news and visualizes which keywords are associated with
each topic. Table 2 shows the 12 topics found in the 40 pieces of BBC news and the 5 keywords
forming each topic:
Table 2 Topic info
Topic      1st Keyword  2nd Keyword  3rd Keyword  4th Keyword  5th Keyword
Topic-1    fan          BTS          boy          album        show
Topic-2    company      Big          Hit          share        label
Topic-3    band         group        music        time         world
Topic-4    person       band         song         group        country
Topic-5    video        record       YouTube      song         hour
Topic-6    BTS          South Korea  service      band         member
Topic-7    chart        number       album        US           Billboard
Topic-8    member       group        BTS          RM           Jin
Topic-9    trainee      group        star         company      dance
Topic-10   Kpop         band         artist       star         music
Topic-11   BTS          year         artist       world        industry
Topic-12   world        pop          fan          music        year
It is interesting to note that topic 1 contains the keywords fan, BTS, boy, album, and show. As
can be seen from Table 2, its 1st keyword is the noun fan, which is regarded as the most
frequent keyword in this topic. It should also be pointed out that topic 6 is made up of the
keywords BTS, South Korea, service, band, and member. In this topic, the 1st keyword is the
word BTS, which is assumed to be the most frequently used one in topic 6. It is also
interesting to observe topic 7, which includes the keywords chart, number, album, US, and
Billboard. In this topic, the keyword chart is considered to be the most frequently used one.
It is worth noting, on the other hand, that topic 9 consists of the keywords trainee, group,
star, company, and dance. It is also important to mention that the keywords world, pop, fan,
music, and year constitute topic 12. As expected, the keyword world counts as the most
frequent one in topic 12.
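The links between keywords and topics in Table 2 can also be read off programmatically. The sketch below simply transcribes Table 2 into a dictionary and looks up which topics a keyword belongs to. The names TOPICS and topics_containing are illustrative, and treating Big and Hit in topic 2 as two separate keywords is an assumption based on the five-keyword layout of the table.

```python
from collections import Counter

# Table 2, keywords listed in 1st-to-5th order for each topic.
TOPICS = {
    1: ["fan", "BTS", "boy", "album", "show"],
    2: ["company", "Big", "Hit", "share", "label"],  # assumed split of "Big Hit"
    3: ["band", "group", "music", "time", "world"],
    4: ["person", "band", "song", "group", "country"],
    5: ["video", "record", "YouTube", "song", "hour"],
    6: ["BTS", "South Korea", "service", "band", "member"],
    7: ["chart", "number", "album", "US", "Billboard"],
    8: ["member", "group", "BTS", "RM", "Jin"],
    9: ["trainee", "group", "star", "company", "dance"],
    10: ["Kpop", "band", "artist", "star", "music"],
    11: ["BTS", "year", "artist", "world", "industry"],
    12: ["world", "pop", "fan", "music", "year"],
}


def topics_containing(keyword):
    """Return the topic numbers whose keyword list includes the keyword."""
    return [topic for topic, words in TOPICS.items() if keyword in words]


# Number of topics each keyword is linked to.
links_per_keyword = Counter(word for words in TOPICS.values() for word in words)

print(topics_containing("BTS"))   # [1, 6, 8, 11]
print(topics_containing("band"))  # [3, 4, 6, 10]
print(links_per_keyword["group"])  # 4
```

A lookup of this kind gives the same information as a keyword-topic map: BTS, band, and group are each linked to four of the twelve topics.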
Now let us consider how often each topic occurs in the 40 pieces of BBC news: