Advances in Social Sciences Research Journal – Vol. 10, No. 8
Publication Date: August 25, 2023
DOI:10.14738/assrj.108.15231.
Kang, N. (2023). English as a Second Language in 350 YouTube Videos: A Big Data Analysis. Advances in Social Sciences Research
Journal, 10(8). 41-52.
Services for Science and Education – United Kingdom
English as a Second Language in 350 YouTube Videos:
A Big Data Analysis
Namkil Kang
Far East University, South Korea
ABSTRACT
The goal of this paper is to provide an in-depth analysis of 350 YouTube videos
about English as a second language. First, one noun has the highest frequency
(1,439 tokens) in the 350 YouTube videos. Second, topic 9 occurred most often,
followed by topic 7, topic 10, and topic 16, in descending order. Third, the keyword
English is the most prominent and pivotal in the word cloud of the 350 YouTube
videos, followed by the keyword language and the keyword person, in that order.
As for core keywords, the keyword English was the most frequently used in the
350 YouTube videos, followed by the keywords language, video, paper, and com, in
that order. This paper also argues that the keyword English has the highest
centrality and frequency, followed by the keyword language. The keyword SSLC has
high centrality (the third highest), but its frequency is relatively low. Frequency
and centrality both indicate the importance of words, but they differ in that the
former measures how often words are used, whereas the latter measures the
importance, prominence, prestige, influence, and reputation of words.
Keywords: English as a second language, NetMiner, word cloud, frequency, centrality,
topic, keyword
INTRODUCTION
The main goal of this paper is to provide an in-depth analysis of 350 YouTube videos about
English as a second language, together with their comments, posted from July 2022 to July
2023. We used the YouTube data collector to collect the 350 YouTube videos and their
comments, and the software package NetMiner to analyze them.
First, we go over the frequency, proportion, and cumulative proportion of all nouns that were
used in the 350 YouTube videos and their comments. Second, we examine the 17 topics that
constitute the 350 YouTube videos, their keywords, and how often each topic occurred. By
doing so, we can see which topic was the most widely used. Third, we consider the word cloud,
which shows which central keywords were used in the 350 YouTube videos. Fourth, we examine
the use and frequency of important keywords in descending order. Important and central
words are bound to have high frequency, which we take as indicating that they are pivotal in
the 350 YouTube videos. Finally, we present and examine the centrality map, which shows the
centrality of words. Note that a word can have low frequency and still have the high
centrality that central and pivotal words are bound to have. Put differently, we can see
which words are important and pivotal in terms of either their frequency or their centrality.
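To make these measures concrete, the following minimal sketch computes frequency, proportion, cumulative proportion, and a simple centrality score for a toy set of nouns. It rests on several assumptions: the documents are hypothetical and are not data from the 350 YouTube videos, and degree centrality on a co-occurrence network stands in for the centrality measure, since the measure that NetMiner applied is not specified here. The names docs, frequency, neighbours, and centrality are illustrative only.

```python
from collections import Counter, defaultdict
from itertools import accumulate, combinations

# Hypothetical toy documents (lists of nouns); NOT data from the paper.
docs = [
    ["english", "language"],
    ["english", "language"],
    ["english", "language"],
    ["english", "exam", "video", "teacher"],
]

# Frequency: how often each noun is used across all documents.
frequency = Counter(word for doc in docs for word in doc)

# Proportion and cumulative proportion, in descending order of frequency.
ranked = frequency.most_common()
total = sum(frequency.values())
proportions = [count / total for _, count in ranked]
cumulative = list(accumulate(proportions))

# Degree centrality (an assumed stand-in for the paper's centrality):
# two nouns are linked if they appear in the same document, and a noun's
# score is its number of distinct neighbours divided by (n - 1).
neighbours = defaultdict(set)
for doc in docs:
    for a, b in combinations(set(doc), 2):
        neighbours[a].add(b)
        neighbours[b].add(a)
n = len(frequency)
centrality = {word: len(neighbours[word]) / (n - 1) for word in frequency}

for (word, count), prop, cum in zip(ranked, proportions, cumulative):
    print(word, count, round(prop, 3), round(cum, 3), centrality[word])
```

In this toy data, exam occurs only once but is linked to three other nouns, whereas language occurs three times but is linked only to english. A word can therefore rank low in frequency and high in centrality, which is the pattern reported above for the keyword SSLC.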
RESULTS
Results of Frequency
In what follows, we examine the frequency of all nouns that occurred in the 350 YouTube
videos. Table 1 shows the frequency, proportion, and cumulative proportion of all nouns that
were used in the 350 YouTube videos:
Table 1 Results of frequency
Value Frequency Proportion Cumulative Proportion
1.0 1439 0.432 0.432
2.0 457 0.137 0.57
3.0 200 0.06 0.63
4.0 148 0.044 0.674
5.0 87 0.026 0.7
6.0 109 0.033 0.733
7.0 74 0.022 0.755
8.0 64 0.019 0.775
9.0 47 0.014 0.789
10.0 38 0.011 0.8
11.0 22 0.007 0.807
12.0 35 0.011 0.817
13.0 44 0.013 0.831
14.0 21 0.006 0.837
15.0 15 0.005 0.841
16.0 22 0.007 0.848
17.0 16 0.005 0.853
18.0 13 0.004 0.857
19.0 13 0.004 0.861
20.0 17 0.005 0.866
21.0 10 0.003 0.869
22.0 12 0.004 0.872
23.0 11 0.003 0.876
24.0 15 0.005 0.88
25.0 7 0.002 0.882
26.0 12 0.004 0.886
27.0 7 0.002 0.888
28.0 7 0.002 0.89
29.0 9 0.003 0.893
30.0 9 0.003 0.895
31.0 9 0.003 0.898
32.0 9 0.003 0.901
33.0 8 0.002 0.903
34.0 5 0.002 0.905
35.0 3 0.001 0.906
36.0 5 0.002 0.907
37.0 6 0.002 0.909
38.0 4 0.001 0.91
39.0 5 0.002 0.912
40.0 3 0.001 0.913
41.0 3 0.001 0.913
42.0 4 0.001 0.915
43.0 4 0.001 0.916
44.0 5 0.002 0.917
45.0 3 0.001 0.918
46.0 7 0.002 0.92
47.0 2 0.001 0.921
48.0 6 0.002 0.923
49.0 3 0.001 0.924
50.0 1 0 0.924
51.0 7 0.002 0.926
52.0 7 0.002 0.928
54.0 5 0.002 0.93
55.0 1 0 0.93
56.0 2 0.001 0.931
57.0 3 0.001 0.931
58.0 3 0.001 0.932
59.0 2 0.001 0.933
60.0 4 0.001 0.934
61.0 5 0.002 0.936
62.0 6 0.002 0.938
63.0 3 0.001 0.938
64.0 1 0 0.939
65.0 1 0 0.939
66.0 4 0.001 0.94
67.0 2 0.001 0.941
69.0 3 0.001 0.942
70.0 3 0.001 0.943
71.0 2 0.001 0.943
72.0 3 0.001 0.944
73.0 1 0 0.944
75.0 4 0.001 0.946
76.0 2 0.001 0.946
77.0 1 0 0.947
78.0 2 0.001 0.947
79.0 4 0.001 0.948
80.0 2 0.001 0.949
81.0 3 0.001 0.95
82.0 1 0 0.95
83.0 3 0.001 0.951
84.0 2 0.001 0.952
85.0 6 0.002 0.953
87.0 3 0.001 0.954
88.0 1 0 0.955
89.0 1 0 0.955
90.0 2 0.001 0.956