
Advances in Social Sciences Research Journal – Vol. 10, No. 8

Publication Date: August 25, 2023

DOI: 10.14738/assrj.108.15231

Kang, N. (2023). English as a Second Language in 350 YouTube Videos: A Big Data Analysis. Advances in Social Sciences Research Journal, 10(8), 41-52.

Services for Science and Education – United Kingdom

English as a Second Language in 350 YouTube Videos:

A Big Data Analysis

Namkil Kang

Far East University, South Korea

ABSTRACT

The goal of this paper is to provide an in-depth analysis of 350 YouTube videos about English as a second language. A first point to note is that one noun has the highest frequency (1,439 tokens) in the 350 YouTube videos. A further point is that topic 9 occurred most often, followed by topic 7, topic 10, and topic 16, in descending order. A major finding is that the keyword English is the most important and pivotal in the word cloud of the 350 videos, followed by the keywords language and person, in that order. As for core keywords, English was the most frequently used in the 350 videos, followed by language, video, paper, and com, in that order. This paper also argues that the keyword English has the highest centrality and frequency, followed by the keyword language. The keyword SSLC has high centrality (the third highest) but relatively low frequency. Frequency and centrality both indicate the importance of words, but they differ: the former refers to how often a word is used, whereas the latter refers to the importance, prominence, prestige, influence, and reputation of a word.

Keywords: English as a second language, NetMiner, word cloud, frequency, centrality, topic, keyword

INTRODUCTION

The main goal of this paper is to provide an in-depth analysis of 350 YouTube videos and their comments about English as a second language, broadcast from July 2022 to July 2023. We used the YouTube data collector to collect the 350 videos and their comments, and the software package NetMiner to analyze them. First, we go over the frequency, proportion, and cumulative proportion of all nouns used in the 350 videos and their comments. Second, we examine the 17 topics that occurred in the 350 videos, along with their keywords, and look at how often each topic occurred, which shows which topic was the most widely used. Third, we consider the word cloud, which shows which central keywords were used in the 350 videos. Fourth, we examine the use and frequency of important keywords in descending order. Important and central words tend to have high frequency, which we take as indicating that they are pivotal in the 350 videos. Finally, we present and examine the centrality map, which shows the centrality of words. Note that a word can have low frequency yet high centrality, a property that central and pivotal words are bound to have. Put differently, we can see which words are important and pivotal in terms of either their frequency or their centrality.
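
The difference between frequency and centrality can be illustrated with a minimal sketch. It assumes that centrality means normalized degree centrality in a word co-occurrence network; the measure that NetMiner actually computes is not specified in this section, so this is an illustration only. The function name and the toy documents are hypothetical and are not taken from the 350 videos.

```python
from collections import Counter
from itertools import combinations


def frequency_and_degree_centrality(documents):
    """Return (frequency, centrality) for every word in the documents.

    Assumption: two words are linked if they co-occur in the same document,
    and centrality is normalized degree centrality (number of distinct
    neighbours divided by the number of other words in the vocabulary).
    """
    frequency = Counter()
    neighbours = {}
    for words in documents:
        frequency.update(words)
        # Link every pair of distinct words that share a document.
        for a, b in combinations(sorted(set(words)), 2):
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    n = len(frequency)
    centrality = {
        word: (len(neighbours.get(word, ())) / (n - 1) if n > 1 else 0.0)
        for word in frequency
    }
    return frequency, centrality


# Hypothetical toy documents (lists of nouns), not the paper's data.
toy_documents = [
    ["english", "english", "english", "english", "lesson"],
    ["exam", "grammar", "speaking", "lesson"],
]
frequency, centrality = frequency_and_degree_centrality(toy_documents)
for word in sorted(frequency):
    print(word, frequency[word], centrality[word])
```

In this toy example the word exam occurs only once but is linked to three of the four other words, whereas english occurs four times but is linked to only one, so a word can rank low on frequency and high on centrality under this assumed measure.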


RESULTS

Results of Frequency

In what follows, we examine the use and frequency of all nouns that occurred in the 350 YouTube videos. Table 1 shows the frequency, proportion, and cumulative proportion of all nouns used in the 350 YouTube videos.
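
As a reading aid for Table 1, the following minimal sketch shows how a proportion column and a cumulative proportion column can be derived from a frequency column. It assumes that each proportion is the frequency divided by the total of all frequencies, rounded to three decimals; the function name and the toy counts are hypothetical and are not the paper's data.

```python
def frequency_table(frequencies):
    """Build (value, frequency, proportion, cumulative proportion) rows.

    Assumption: proportion = frequency / total, and the cumulative
    proportion is the running sum of frequencies divided by the total.
    """
    total = sum(frequencies.values())
    rows = []
    running = 0
    for value in sorted(frequencies):
        freq = frequencies[value]
        running += freq
        # Cumulate the raw counts first so rounding errors do not add up.
        rows.append((value, freq, round(freq / total, 3), round(running / total, 3)))
    return rows


# Hypothetical toy counts, not taken from Table 1.
toy_counts = {1: 50, 2: 30, 3: 20}
for row in frequency_table(toy_counts):
    print(row)
```

Cumulating the raw counts rather than the rounded proportions keeps the last cumulative value at exactly 1.0.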

Table 1. Results of frequency

Value Frequency Proportion Cumulative Proportion

1.0 1439 0.432 0.432

2.0 457 0.137 0.57

3.0 200 0.06 0.63

4.0 148 0.044 0.674

5.0 87 0.026 0.7

6.0 109 0.033 0.733

7.0 74 0.022 0.755

8.0 64 0.019 0.775

9.0 47 0.014 0.789

10.0 38 0.011 0.8

11.0 22 0.007 0.807

12.0 35 0.011 0.817

13.0 44 0.013 0.831

14.0 21 0.006 0.837

15.0 15 0.005 0.841

16.0 22 0.007 0.848

17.0 16 0.005 0.853

18.0 13 0.004 0.857

19.0 13 0.004 0.861

20.0 17 0.005 0.866

21.0 10 0.003 0.869

22.0 12 0.004 0.872

23.0 11 0.003 0.876

24.0 15 0.005 0.88

25.0 7 0.002 0.882

26.0 12 0.004 0.886

27.0 7 0.002 0.888

28.0 7 0.002 0.89

29.0 9 0.003 0.893

30.0 9 0.003 0.895

31.0 9 0.003 0.898

32.0 9 0.003 0.901

33.0 8 0.002 0.903

34.0 5 0.002 0.905

35.0 3 0.001 0.906

36.0 5 0.002 0.907

37.0 6 0.002 0.909

38.0 4 0.001 0.91

39.0 5 0.002 0.912


40.0 3 0.001 0.913

41.0 3 0.001 0.913

42.0 4 0.001 0.915

43.0 4 0.001 0.916

44.0 5 0.002 0.917

45.0 3 0.001 0.918

46.0 7 0.002 0.92

47.0 2 0.001 0.921

48.0 6 0.002 0.923

49.0 3 0.001 0.924

50.0 1 0 0.924

51.0 7 0.002 0.926

52.0 7 0.002 0.928

54.0 5 0.002 0.93

55.0 1 0 0.93

56.0 2 0.001 0.931

57.0 3 0.001 0.931

58.0 3 0.001 0.932

59.0 2 0.001 0.933

60.0 4 0.001 0.934

61.0 5 0.002 0.936

62.0 6 0.002 0.938

63.0 3 0.001 0.938

64.0 1 0 0.939

65.0 1 0 0.939

66.0 4 0.001 0.94

67.0 2 0.001 0.941

69.0 3 0.001 0.942

70.0 3 0.001 0.943

71.0 2 0.001 0.943

72.0 3 0.001 0.944

73.0 1 0 0.944

75.0 4 0.001 0.946

76.0 2 0.001 0.946

77.0 1 0 0.947

78.0 2 0.001 0.947

79.0 4 0.001 0.948

80.0 2 0.001 0.949

81.0 3 0.001 0.95

82.0 1 0 0.95

83.0 3 0.001 0.951

84.0 2 0.001 0.952

85.0 6 0.002 0.953

87.0 3 0.001 0.954

88.0 1 0 0.955

89.0 1 0 0.955

90.0 2 0.001 0.956