Observing coronavirus information on YouTube: network and content analysis of the U.S., Korea, India, and Mexico
Abstract
Diverse contents are generated from a wide variety of sources online as new media platforms give people the capacity to create, reconfigure, and spread information in unprecedented amounts. To help users overcome this information overload, online platforms take user preferences and attitudes into account and use this information to program algorithms that mediate and facilitate content promotion (Cinelli et al., 2020). YouTube can also determine what people watch with its recommendation algorithm that is based on user responses and settings such as geography. In this sense, considering the differences in user reaction and recommendations in four different countries during the coronavirus pandemic is the main contribution that this research makes in researching YouTube. Specifically, this study questions how YouTube videos are connected by content type, and how the connection affects user responses in different countries. With the use of a mixed method approach that combines network and content analysis, this study describes the spread of coronavirus information on YouTube in the United States, Mexico, Korea and India, focusing on how differing interactions can reconfigure the architecture of the YouTube network in different countries.
Background
The recent coronavirus outbreak that began in early 2020 is the biggest health emergency the world has faced in over a century. When the pandemic forced countries into lockdowns with stay-at-home orders and social distancing guidelines, the situation paved a way for an unprecedented digital acceleration. People have turned to online platforms to stay connected and to move towards the new normal. During the pandemic, social networking services have proved to be very crucial in spreading awareness about the virus. Social networking platforms are flooded with ideas, health tips, and ways of treating illnesses, and this has proven to be the case with the recent pandemic. Since the Internet provides easy access to sources containing health information, YouTube has especially developed into a dominant platform for spreading health information (D’Souza et al., 2020; Vance et al., 2009).
YouTube, with currently over 2 billion users worldwide, has played a very important role during this crisis. The online video-sharing platform provides unlimited space for organizations and individuals to produce and upload videos. Along with the increasing amount of content on YouTube, people watch over a billion hours of video and generate billions of views every day. YouTube can be accessed in a total of 80 languages and has launched local versions in more than 100 countries. This massive global presence makes it one of the most powerful media platforms today.
To help users navigate through the tremendous amount of information on YouTube, its recommendation system provides broad personalization via collaborative filtering (Covington et al., 2016). Given the way the platform operates on the basis of user selection, YouTube's algorithm increasingly filters videos depending on the behavior of users. While it is difficult to know precisely how the recommendation algorithm behaves, it seems likely that recommended videos align with content that users have already watched (Masadeh & Hamilton, 2020). Airoldi, Beraldo and Gandini (2016, p. 2) argue that clusters that are formed on YouTube can be considered as “products of crowd-generated principles of similarity”, which supports the idea that sets identified in a network represent aggregated data on the different cultural interests and intentions that users may have.
This study takes YouTube as the main subject of analysis, and questions how its videos, especially on that of the recent coronavirus pandemic, are connected in terms of content and source types, and how the connection affects viewer responses in multiple cultural contexts. This study will use a mixed method approach of network and content analysis to describe the spread of coronavirus information on YouTube in the United States, South Korea, India, and Mexico.
Literature Review
YouTube and health communication
YouTube provides space for users from diverse cultural backgrounds to contribute and interact with videos, and health-related contents are one of the most frequently viewed contents on YouTube. According to a survey conducted by the Pew Research Center, 75 percent of patients acquired knowledge on how to treat their condition by searching health information online (Pew Research Center, 2006). The result suggests that a platform like YouTube can serve as an important channel for sharing and disseminating timely health-related information (Madathil et al., 2014). In recent years, YouTube has actually become a very significant platform for health communication (Briones, 2012), and researchers have begun to study the role of YouTube in relation to health issues. For example, studies on YouTube as a source of information for immunization and vaccines (Briones, 2012; Donzelli et al., 2018; Keelan et al., 2007), smoking (Freeman & Chapman, 2007; Paek et al., 2010), and obesity (Yoo & Kim, 2012; Zhan et al., 2014) were conducted, and the results of these studies suggest that YouTube is a major vehicle for disseminating health information.
Especially in the context of a pandemic, digital communication presents an opportunity for faster information flow. For example, during the 2009 H1N1 epidemic, more than 3 million views for the CDC-produced H1N1 videos broadcast on YouTube indicate that a significant number of individuals were interested in information relevant to the epidemic. The increasing reliance on such information has led to the emphasis on the importance of public health entities being willing to adapt to the changing nature of information dissemination in the face of innovative technologies and new media (Walton, Seitz, & Ragsdale, 2012).
However, in March of 2020, the World Health Organization (WHO, 2020) warned about the massive spread of misinformation that caused many episodes of confusion around the world during the pandemic. This is mainly a result of giving capacity for any communicating subject to act on the communication network and giving people and organizations the possibility of reconfiguring the network and even the information according to their needs, desires, and interests (Castells, 2004). Social networking platforms such as YouTube, Facebook, Twitter, and Reddit have given people the capacity to reconfigure and spread information in unprecedented forms. The new instability generated by this new communication war-game has generated a so-called “new word-of-mouth culture” (Xin, 2020). The coronavirus pandemic has produced examples that have exacerbated the weaknesses and instability of this new movement of information (Christians et al., 2020).
As users interact more with these media platforms, the platforms take users' preferences and attitudes into account and use this information to program algorithms that mediate and facilitate content promotion and thus information spread (Cinelli et al., 2020). Information conveyed on social media and the sharing of such information are now closely tied to algorithmically driven models that link individual activities to echo chambers of information. Castells (2004) refers to the digital era as a time defined by computational programs that assign to each network its objectives and rules of operation through algorithms. Programs are based on codes that include the evaluation of behavior and monitoring criteria and depend on communication protocols specifically designed and based on those algorithms. In the case of YouTube, the algorithm decides what people watch on YouTube 70% of the time, and according to Pew Research Center, 81% of American YouTube users say they regularly watch videos recommended by the algorithm (Cooper, 2020).
Use of YouTube as health channels
As the aim of this study is to observe YouTube in different cultural contexts – specifically the different patterns of coronavirus information on YouTube – this study observes four different countries: the United States, South Korea, India, and Mexico.
United States
As of 2019, around three-quarters of U.S. adults said they use YouTube, and half of the adults who use YouTube said that it is an important source for retrieving information about things they haven’t seen or done before (Van Kessel, 2019). The Health Information National Trends Survey reported that 75% of Americans refer to the Internet first when searching for health or medical information (National Cancer Institute, 2020). YouTube is one of the most popular online platforms that has emerged as a significant source of health information in the U.S., with public health agencies investing resources in YouTube as a channel for health communication. Many state health departments (SHDs) in the U.S. use media for health promotion. Even in 2011, as many as 60% of SHDs had at least one social media application; 86.7% were on Twitter, 56% were on Facebook, and 43% were on YouTube (Thackeray et al., 2012).
South Korea
YouTube is a widely used platform in Korea, with the average use time surpassing that of other widely used social networking services and messenger apps. In the case of Korea, the growth of YouTube has soared, with increased viewing time (Chrichton, 2020). Specifically, statistics show that 83% of the Korean population are on YouTube, and the average monthly usage time was 29.5 hours, which is 2.5 times that of KakaoTalk (12 hrs.) and Facebook (11.7 hrs.). Also, the time is significantly higher than Naver (10.2), a widely used search portal site (Oh, 2020). After the spread of the coronavirus infections, interest in healthcare has surged, and many media users in Korea have turned to YouTube to find health-related information. YouTube has changed its role from an entertainment platform to an important medium for conveying information. Such demand for quality health information on YouTube has motivated credible sources to become active participants on YouTube, providing their expertise on health issues. For example, doctors have come together to provide factual and credible information on the coronavirus on their channel ‘DoctorFriends’, and also the Korean Medical Association has been releasing YouTube videos related to the coronavirus on its YouTube channel.
Mexico
In the case of Mexico, based on figures reported by World Internet Stats, approximately 85 million inhabitants have access to the Internet. Though not as high as other countries, Mexico has a solid number of users who spend more than an hour or two on YouTube (Navarro, 2020). According to statistics released by a marketing firm in 2019, YouTube held up to 19% of the video consumption market in Mexico (Riquelme, 2019). This means that YouTube has a larger video audience than pay TV, any broadcast TV channel, streaming video platforms and all the most popular social networks, such as Facebook, Twitter and Instagram. In addition to this, based on Ipsos data cited by Google (2019), 92% of adults who connect to the Internet in Mexico access YouTube, which is about 19 percentage points more than in the United States, the birthplace of the company, where 73% of adults access it (Van Kessel, 2019). In general, 86% of users surveyed in Argentina, Chile, Colombia, Peru and Mexico have increased their use of YouTube since the beginning of the pandemic, according to research conducted by Talkshoppe, but specifically in Mexico, there was a 130% increase in hours of content uploaded to the platform, while the number of subscribers to Mexican channels grew (Ramos, 2020).
India
YouTube has over 325 million monthly active users over the age of 18 in India. Over the last year, there has been a rise in regional content on YouTube and it has become the most used digital media platform in rural India, according to CSC e-Governance Services data as reported by the Economic Times (Agarwal, 2020). Lockdown and restrictions from the pandemic led to the surge in data consumption. People across India turned to YouTube for information about the coronavirus. During the lockdown in India, coronavirus related content saw a surge of 98 percent in terms of views and 199 percent in terms of engagement and 130 percent increase in uploads, according to the report by Mindshare-Vidooly (2021). News channels, government agencies, influencers, celebrities, doctors and experts from health, fitness and other industries also published content around the coronavirus. For example, Apollo Hospitals published a video explaining the virus in Hindi language which received over 1 million views.
Researching YouTube’s network
Media in the network society present a large variety of channels of communication with increasing interactivity. These channels do not constitute a unified, centralized culture, but are inclusive of a wide range of cultures and social groups; and send targeted messages to selected audiences. Such media systems are characterized by diversification of the audience, by technological versatility and channel multiplicity, and by the growing autonomy of the audience (Castells, 2004). Jenkins (2006) refers to this phenomenon as the “participatory culture” that is observed thanks to the interconnection between distribution channels, platforms, and technologies. Jenkins uses this term to describe a process of cultural transformation that affects the way media are used. Thus, unlike older forms of media that were characterized primarily by unidirectional exchange, games, forums, websites, YouTube videos bring to light the opportunity for active multidirectional engagement that encourages reticular forms of production. Participatory culture describes how software and online infrastructure (platforms) enable creative expression and its sharing with others so that it can be valued (Rogers, 2019).
YouTube is one of the largest search engines in the world by search volume, second only to Google, and has a search algorithm and recommendation system based on the principle of collaborative filtering, designed to help users navigate the millions of videos available on its site (Marchal et al., 2020; Covington et al., 2016). Based on Airoldi, Beraldo and Gandini’s study (2016), clusters can be considered “crowd generated categories” and “crowd individual activity”, which supports the idea that the sets that are identified represent aggregated data on the different cultural interests and intentions that users may have. A study on the locality of online YouTube videos showed that despite the global nature of the web, online video consumption is likely to be constrained by geographic locality of interest (Broderson & Scellato, 2012). This is because YouTube makes personalized video recommendations from a large database of content with deep neural networks that consider a user’s search and watch history, along with the user’s geographic region, device, gender, logged-in status, and age as additional data points (Covington et al., 2016).
Network analysis is a set of techniques that allows researchers to observe relations that form among actors, and to analyze the social structure that forms from the recurrence of these relations (Chiesi, 2015). Network analysis takes into account the assumption that social phenomena are better explained through analysis of the relations that are formed among its entities. Such techniques can be applied to observe how the network of different entities inside media platforms affect the diverse sociocultural phenomena that arise from these platforms. Therefore, network analysis has been used to detect different clusters that are formed inside YouTube.
Accordingly, network analysis is used to observe information diffusion through social networks. In the case of YouTube, network analysis enables researchers to detect relationships between videos, and see how different video characteristics can influence the spread of information. Semantic analysis is conducted along with network analysis to reveal characteristics of contents (Airoldi et al., 2016; Lee et al., 2018). Prior studies have observed issue networks, and also comment networks on YouTube. Murthy and Sharma (2018) examined YouTube’s comment space using a network-based approach to identify and characterize user exchanges and understand the types of interactions.
Analyzing the network society in times of the pandemic implies considering the most assiduous communication channels such as the case of YouTube, where the dissemination of videos and information also achieves a high and very varied penetration of audiences. In this sense, the consideration in this study of the differences in attitudes and reactions of users of social networks in four different countries is the main contribution that this analysis makes within recent studies of social network communication during the pandemic. The network develops in multiple cultural settings, which materializes in specific forms leading to the formation of highly diverse institutional systems. The network society keeps its networked organization at the global level at the same time as it makes itself specific in every society (Castells, 2004). Thus, social-digital networks converted into complex network societies have the capacity to reconfigure to adapt to specific cultural parameters. But in emergency conditions these parameters can also present in different clusters notoriously similar to forms that can only come from human behaviors that go beyond cultures, and rather focus on emotions that an emergency situation can arouse at a very elemental level in the human being.
Based on the literature review, this study focuses on observing what type of contents – or subject matters – are seen in YouTube videos on the coronavirus outbreak, how they are connected by YouTube’s algorithm, and how they may differ by country by using network analysis techniques. Additionally, this study will look at what the main sources are for the content types and how users engage with different types of content. Therefore, this study presents three research questions as below.
RQ1: How does YouTube make recommendations on the coronavirus, and how does it differ by country?
RQ2: What kind of content and source types are seen on YouTube in concern to the pandemic, and how does it differ by country?
RQ3: How do users engage with such content, and how does it differ by country?
Research Method
In order to observe how information is being distributed and consumed on YouTube, this study uses a mix of network analysis and content analysis. This mixed method approach has its basis on the digital methods approach (Rogers, 2019), which was applied in a prior study to explore music reception and recommendation on YouTube (Airoldi, Beraldo, & Gandini, 2016). This study implements a similar method to explore recommendation and reception of coronavirus information on YouTube.
Data collection and preparation
Data for this study was collected using the YouTube Data Tools, an open-source tool provided by the Digital Methods Initiative at the University of Amsterdam (Rieder, 2015). This tool uses the YouTube API v3 to provide several modules for crawling extensive data from YouTube. In this study, the researchers specifically used the ‘video network’ module, which returns a list of YouTube videos and creates a network of relations between the videos in the list via YouTube’s “related videos” feature for a specific search query. In addition, the module provides basic quantitative metrics for each video – including number of views, likes, dislikes, comments, and the published date. The tool helps researchers break down how YouTube’s recommendation system works and how it presents (or recommends) information to users, which is one of the main goals of this study.
Data collection was completed by researchers in each country during the first week of July. To maintain consistency in data collection across the four countries, a single term (‘coronavirus’) was translated into each language and used as the search query. The time frame of the seed videos was set to January 1st to June 30th, 2020 – the first six months of the coronavirus outbreak. The language and regional codes were set to match the country in which the data was being collected. Fifty seed videos were initially crawled based on “relevance” to the search query, and crawl depth for the network was set to 1, which returns connections between the seed videos and the videos recommended for them.
Network analysis
Using the ‘video network’ data collected from the YouTube Data Tools, this study took a network-approach to look at the overall topology of coronavirus information on YouTube. The nodes in the network data represent videos, while edges represent connections between the videos linked by YouTube’s recommendation system.
Gephi, an open-source network visualization and analysis software, was used to conduct the network analysis. The network was treated as undirected: since the purpose of this study was to detect the overall aggregation of videos in the network, adding directionality would complicate the analysis (Airoldi, Beraldo, & Gandini, 2016). The ‘ForceAtlas2’ algorithm (Jacomy et al., 2012), a force-directed layout algorithm, was applied to look at the overall topography of the network. Also, basic network metrics including degree, density, and centrality measures were estimated using Gephi.
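The treatment of the recommendation network as undirected, and the basic degree and density metrics, can be illustrated with a minimal Python sketch. This is not the Gephi procedure used in the study; the function name `undirected_metrics` and the video IDs are hypothetical, and the sketch assumes that reciprocal recommendations collapse into a single undirected edge and that self-loops are ignored.

```python
from collections import defaultdict

def undirected_metrics(edges):
    """Collapse directed recommendation edges into an undirected graph and
    return (degree per node, graph density)."""
    neighbors = defaultdict(set)
    for source, target in edges:
        if source == target:
            continue  # assumption: self-recommendations are ignored
        neighbors[source].add(target)
        neighbors[target].add(source)
    # Degree = number of distinct videos a video is linked to.
    degree = {node: len(adjacent) for node, adjacent in neighbors.items()}
    n = len(neighbors)
    m = sum(degree.values()) // 2  # each undirected edge is counted twice
    # Density = observed edges / possible edges in an undirected graph.
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return degree, density

# Hypothetical video IDs; ("a", "b") and ("b", "a") become one edge.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
degree, density = undirected_metrics(edges)
print(degree, density)
```

In this toy network, four videos share three undirected edges out of six possible ones, so the density is 0.5.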
The most important part of this study was to identify clusters of videos within the network of coronavirus information in different cultural settings. In order to find the clusters, a modularity algorithm was implemented in Gephi. ‘Modularity’ is a measurement of the strength of division of a network into modules – also called groups, clusters, or communities (Newman, 2006). Networks with high modularity scores have dense connections between the nodes within the modules, but sparse connections between nodes in different modules. This algorithm is often used to detect community structures in networks.
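To make the modularity score concrete, the following Python sketch scores a given partition of an undirected, unweighted network using the standard formulation Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c the number of edges inside cluster c, and d_c the sum of degrees in cluster c. It only evaluates a partition that is already known; it does not search for the best partition, which is the task of a community detection routine such as the one in Gephi. The function name and the toy network are hypothetical, and the sketch assumes each edge is listed once.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to its cluster label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in cluster c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in cluster c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )

# Toy network: two triangles joined by a single bridge edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_clusters = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_cluster = {node: "A" for node in range(1, 7)}
q_split = modularity(edges, two_clusters)
q_merged = modularity(edges, one_cluster)
print(round(q_split, 3), round(q_merged, 3))
```

Splitting the toy network into its two triangles gives a positive score, while placing every node in one cluster gives zero, which matches the reading of modularity as the strength of division into modules.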
Content analysis
With the results of the network analysis, this study further analyzed the content titles of clusters formed within the networks to observe how different types of contents are linked. As users typically encounter videos on health conditions through keyword searches on YouTube (Susarla, 2020), this study assumed that the frequently used words in video titles can help determine the overall theme of the clusters. Six coders, including the researchers of this study, were used in the study. Two coders were assigned to each country to code the subject matter of the clusters and the top sources that comprise the clusters.
First, subject matter for each cluster was analyzed. The clusters derived from social network analysis were characterized with Natural Language Processing (NLP) method, using the Natural Language Toolkit (NLTK) on Python. This method is useful in analyzing large volumes of textual data by finding structure in a highly unstructured data source. Top nouns from the titles of videos included in each cluster were extracted. The clusters were then labeled by researchers based on the nouns, and these clusters were again regrouped into eight content type categories based on a prior study as outlined in Table 1 (Marchal et al., 2020).
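The noun-extraction step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: it assumes English-language titles that have already been tokenized and part-of-speech tagged (for example with NLTK's `word_tokenize` and `pos_tag`, which return Penn Treebank tags), and the function name `top_nouns` and the sample titles are hypothetical.

```python
from collections import Counter

def top_nouns(tagged_titles, k=10):
    """Return the k most frequent nouns across the titles of one cluster.

    tagged_titles: list of titles, each a list of (word, tag) pairs, in the
    format produced by nltk.pos_tag(nltk.word_tokenize(title)).
    """
    counts = Counter(
        word.lower()
        for title in tagged_titles
        for word, tag in title
        if tag.startswith("NN")  # NN, NNS, NNP, NNPS are the noun tags
    )
    return [word for word, _ in counts.most_common(k)]

# Hypothetical, pre-tagged titles from one cluster.
cluster_titles = [
    [("Coronavirus", "NNP"), ("vaccine", "NN"), ("update", "NN")],
    [("New", "JJ"), ("coronavirus", "NN"), ("vaccine", "NN"), ("trial", "NN")],
    [("Coronavirus", "NNP"), ("explained", "VBD")],
]
print(top_nouns(cluster_titles, k=2))
```

Lowercasing before counting merges "Coronavirus" and "coronavirus"; whether the original analysis normalized case in this way is not stated in the text.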
Secondly, based on the metrics from the network analysis, top 10% videos with the highest degree scores in each cluster were filtered out for further analysis. Source type for the filtered videos were categorized manually by researchers in reference to the source type category that was derived from a prior study, which descriptively observed the source and content types of YouTube videos concerning the coronavirus (Marchal et al., 2020).
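The filtering of the highest-degree videos in each cluster can be sketched in Python as below. The rounding rule is an assumption, since the text does not say how fractional counts were handled: the sketch rounds up and keeps at least one video per cluster. The function name and the video IDs are hypothetical.

```python
import math
from collections import defaultdict

def top_degree_videos(videos, percent=10):
    """Select the top `percent` of videos by degree within each cluster.

    videos: iterable of (video_id, cluster, degree) tuples.
    Assumption: the count is rounded up, with a minimum of one per cluster.
    """
    by_cluster = defaultdict(list)
    for video_id, cluster, degree in videos:
        by_cluster[cluster].append((degree, video_id))
    selected = {}
    for cluster, members in by_cluster.items():
        keep = max(1, math.ceil(len(members) * percent / 100))
        ranked = sorted(members, key=lambda item: item[0], reverse=True)
        selected[cluster] = [video_id for _, video_id in ranked[:keep]]
    return selected

# Hypothetical data: cluster 0 has ten videos, cluster 1 has three.
videos = [(f"v{i}", 0, i) for i in range(1, 11)]
videos += [("w1", 1, 5), ("w2", 1, 9), ("w3", 1, 2)]
top = top_degree_videos(videos)
print(top)
```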
Intercoder reliability tests were conducted for each country’s dataset by researchers in each country. For the U.S. and Korea datasets, Cohen’s Kappa was 0.95 for content coding and 0.98 for source coding, showing a high rate of reliability between the coders. For the India dataset, there was a 0.94 match for content type and a 1.0 match for source type. For Mexico, Cohen’s Kappa was 1.0 for content coding and 0.90 for source coding.
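For reference, Cohen's Kappa for two coders compares the observed agreement p_o with the agreement expected by chance p_e, as κ = (p_o − p_e) / (1 − p_e). The Python sketch below computes it for two lists of labels; the function name and the sample codings are hypothetical and are not taken from the study's data.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders who labeled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty item set.")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's label proportions, summed.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codings of four clusters by two coders.
coder_a = ["factual", "factual", "personal", "personal"]
coder_b = ["factual", "factual", "personal", "factual"]
kappa = cohens_kappa(coder_a, coder_b)
print(kappa)
```

Here the coders agree on three of four items (p_o = 0.75) and chance agreement is 0.5, so Kappa is 0.5, well below the values of 0.90 and above reported for this study.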
Analysis of video metrics
Basic metrics such as ‘degree’, ‘view count’, and ‘comment count’ were analyzed by content type to observe what type of videos are likely to induce more engagement from users, and whether YouTube’s recommendation system responds to such user interaction. ‘Degree’ in network analysis is defined as the number of edges connected to a node; the average degree is the mean of this value across a set of nodes. In this study, a video’s degree can be read as the number of recommendation links it has in the network, which can be estimated in Gephi. The number of views and comments for each video are provided in the original data extracted from the YouTube Data Tools.
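Averaging a metric by content type can be sketched in a few lines of Python. The field names (`content_type`, `degree`, `view_count`, `comment_count`) and the sample values are hypothetical; they stand in for the per-video metrics exported from the network data and the coded content types.

```python
from collections import defaultdict

def mean_by_content_type(videos, metric):
    """Average one numeric metric over the videos of each content type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for video in videos:
        totals[video["content_type"]] += video[metric]
        counts[video["content_type"]] += 1
    return {ctype: totals[ctype] / counts[ctype] for ctype in totals}

# Hypothetical per-video records.
videos = [
    {"content_type": "factual/neutral", "degree": 40,
     "view_count": 1000, "comment_count": 10},
    {"content_type": "factual/neutral", "degree": 20,
     "view_count": 3000, "comment_count": 30},
    {"content_type": "entertainment", "degree": 8,
     "view_count": 9000, "comment_count": 5},
]
avg_degree = mean_by_content_type(videos, "degree")
avg_views = mean_by_content_type(videos, "view_count")
print(avg_degree, avg_views)
```

Comparing the two dictionaries side by side shows, in this toy data, a content type with a low average degree but a high average view count, which is the kind of gap between recommendation and consumption the analysis looks for.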
Research Results
The results are organized in the order of the analyses conducted. First is the result of the overall network analysis, including the modularity algorithm that detects clusters inside the network. Second are the results of the content and source type categorizations conducted on the clusters derived from the network analysis. The networks for each country with content type categories are presented to observe which types of content are more strongly (or weakly) connected. Last, this study looks at the video metrics – including view count, like count, and comment count – for each cluster to observe differences in user engagement by content type.
Cluster detection
The number of nodes and edges in the data for each country differs as seen in Table 3. The U.S. data has the highest number of nodes, followed by Mexico, India, and South Korea. In contrast to the number of nodes in the network, the U.S. has the fewest edges, followed by Mexico, India, and South Korea. The modularity algorithm – a community detection algorithm used in Gephi – was applied to identify clusters of videos within the network. As a result, the U.S. video network had the highest number of clusters and the highest modularity score of 0.72. On the other hand, India had the fewest clusters and the lowest modularity score of 0.40. A high modularity score means the clusters in the network are very closely-knit internally, while weakly tied to other clusters. In contrast, lower scores mean that the clusters are less sharply separated and share more connections with one another, meaning users can move to different content types via YouTube’s video recommendation.
YouTube content and source analysis
Cluster content type
This study attempts to look at the content types of videos in the network. Top ten nouns were extracted from each cluster using the NLP method as an alternative to manual coding of the texts. Clusters were characterized and labeled by the researchers based on the top nouns that were extracted (cluster labels and top nouns are outlined in Appendix 1).
By categorizing clusters into six general content types as shown in Table 1, differences can be observed in the four countries. Table 4 shows the frequency of videos for each content type, and it is observed that factual/neutral videos are highly recommended in all four countries. Also, there aren’t any distinctive video clusters to include in the conspiratorial content type, and therefore the conspiratorial content category was excluded from further analysis. As for personal content type, it is noteworthy that India does not report the same level of recommendation to this information, unlike the U.S., South Korea, and Mexico, where in particular there is a tendency to turn to this platform for guidance, recommendations, and prevention. Another interesting area is that of economic and political information, where this type of information is quite attractive for countries such as the United States and India, while Mexico and South Korea show less information in this area.
Figure 1 is a visualization of the coronavirus information video network for each country. Each country has a different number of content types, and they are marked by different colors and labels. The networks clearly show different characteristics and patterns. The U.S. shows a dispersed network with many different clusters, while India and Korea show a close-knit network along with a few isolated clusters, and Mexico shows a bipolarized network. Overall, factual, investigative, and economic/political clusters are closely linked in all networks, while irrelevant or other health clusters are weakly linked as seen in the U.S. and Indian networks.
Table 6 shows the average degree for each content type. In the case of the U.S., other health (39.14) and factual/neutral (38.00) content clusters show high average degree followed by irrelevant (31.98), personal (27.84), and entertainment (23.92) clusters while economic/political content has the lowest average degree (21.36). Economic/political content cluster has the highest degree level in the case of Korea (61.57), followed by factual/neutral (60.83), investigative (46.16), personal (41.68), then entertainment (7.72). In India, factual/neutral has the highest degree level (56.79), followed by economic/political (51.47), investigative (50.52), then irrelevant (24.59).
Source Type
Top 10% sources with the highest degrees in each cluster were filtered out for each country, and the sources were categorized into five categories. The results in Table 7 show that the ratio of content from established media is the highest in India (45.4%), Korea (46.7%), and Mexico (48.5%). Established media outlets on YouTube seem to be the main source for factual/neutral content. The U.S. has the highest ratio of videos created by independent content creators (56.9%) followed by Korea (41.1%) and Mexico (37.2%), while India has the lowest ratio (21.3%). However, India had the highest ratio of videos published by public media compared to other countries (32.6%). Though the ratio of contents from professional health institutions is overall in the lower range, the ratio is relatively higher in Mexico compared to other countries (6.6%).
Engagement with different content types
Using degree, view count, like/dislike count, and comment count as metrics, this study attempted to look at the relation between recommendation of content types and user engagement. As seen in Table 8, overall average view count is the highest in the U.S., followed by India, Mexico, and Korea. Average view count for entertainment content is the highest in the U.S. with 9.3 million views. Usership for entertainment was also high for Korean users, followed by factual/neutral and personal content. In case of India, the metrics show that users consumed more factual/neutral content than any other categories. Investigative videos were also highly consumed by Indians with more than 1.8 million views as compared to the other three countries. Similarly, a trend of users viewing factual/neutral clusters can be seen in Mexico, with more than 1.6 million views.
Average like/dislike count and comment count parallel the number of views – the U.S. has the highest average, followed by India, Mexico, then Korea. Based on the ratio of engagement, users in all four countries preferred to engage with the content by clicking like/dislike rather than actually making comments on the videos. Though the ratio of comments is very low, the ratio was slightly higher for factual/neutral, investigative, and economic/political content.
Discussion and conclusion
The production and consumption of information enabled by new media platforms have led producers and consumers to search for new content. This study attempted to observe the dynamics of production and consumption of information on YouTube during a time when the need for information online is high, and the main findings can be outlined as follows.
First, though similar patterns of video networks were seen in the four countries, there were also some distinguishing differences. In all four countries, the factual/neutral, investigative, economic/political, and personal clusters are adjacent to one another in the network, while the entertainment and irrelevant clusters are isolated from it. However, the modularity scores show that in the case of the U.S., if a user starts viewing with entertainment content, it becomes more difficult for them to leave the ‘rabbit hole’ of entertaining content, as the clusters are internally very closely knit while only weakly tied to other clusters. In contrast, in the case of India, Korea, and Mexico, the clusters in the networks were weakly tied to one another, meaning users can move to different clusters through YouTube’s video recommendation.
Second, there was a difference in the types of sources recommended in each country. In the U.S., Korea, and Mexico, established media outlets and independent content creators were the most recommended sources, while in India, established media and public media were the most recommended by YouTube. The difference in source was especially noticeable for factual/neutral and economic/political content. Though content from professional health agencies was expected to be highly recommended, analysis of the top 10% of sources with the highest degrees shows that established media and independent content creators have a much stronger presence on YouTube than professional health agencies.
Lastly, user engagement with the content types was low in all countries. However, in the case of the U.S., the results show an accelerating trend of users accessing YouTube to look for entertaining content. Despite YouTube’s effort to recommend more factual and investigative content to users, the type of content that gets the most views and comments is entertainment videos, which was especially noticeable in the case of the U.S. Also, in the case of Korea, though the average degree of entertainment videos is lower than that of other content types, they received similar views. In general, there is a significant gap between the number of views and the number of comments; in the case of Mexico, this may be explained by users searching for information rather than commenting on what they find. This is also evident in the number of reactions to the videos. As the table shows, reactions are fewer and do not correspond to the quantities of views. The data may also indicate that many content creators used the coronavirus topic as clickbait to gain views and engagement; on further analysis, these videos were found to be irrelevant.
The results showed significant consumption of mainstream media channels and institutional sources. During the pandemic, YouTube users tended to favor content disseminated by established media because of the reliability these outlets represent, and mainly because of a general wariness of conspiracy theories and the increasingly widespread work of these media in verifying information. A recent study by the marketing analysis company ComScore observed that mainstream companies recovered digital credibility during the pandemic, a place they had ceded to social networks and news portals (Vega, 2020). Thus, media channels on YouTube appear to have regained ground because they can carry out much more in-depth and professional investigations and generate their own original content; users who produce and distribute their own content do not have the same competitive capacity to disseminate it. In the case of India, we observed that most of the YouTube content on the pandemic came from public and established media houses, while government bodies and health organizations made very little effort to disseminate information via YouTube. India was under a strict lockdown, and hence we see content related to the economic and political implications of the coronavirus.
The results seen in these three countries contrast with the case of the United States, where videos from independent creators outnumber those broadcast by established media. Parodies about the coronavirus account for a considerable share of the sample. Engagement with entertainment videos also stands out, considerably surpassing the other countries, with more than nine million views for this type of content, which was also the most commented on and liked in the U.S. sample. Another notable finding in the analysis of the content type clusters is that the United States is the only one of the four countries with a significant amount of content related to fact-checking. It should be recalled that during the initial months of the pandemic, the United States saw a significant circulation of videos related to conspiracy theories and fake news about the coronavirus.
Differences may result from differing policies towards YouTube in each country, and also from the differing responses to the coronavirus by governments and the public. As clusters can be considered “crowd generated categories” and “crowd individual activity”, they represent aggregated data on the different cultural interests and intentions that users may have (Airoldi et al., 2016). Government measures differed during the first months of the situation: Asian countries like China and Korea tackled the pandemic early on and are still managing to recover quite effectively. India ranks second in the number of coronavirus infections after the U.S. but has managed to keep its death rate lower than that of most Western countries, while Mexico has seen a fluctuating rate of coronavirus deaths, even surpassing India's rate in the early months of 2021. The surge in infection and death rates in countries like India and Mexico may have led users to seek factual/neutral content.
Though some of the results from the analysis seem meaningful, there are still limitations to this research. First, though the analysis was conducted to examine the overall content of clusters in the video networks, labeling clusters based on frequently used nouns in video titles may be limited as a form of content analysis. Second, the unit of analysis for this study was YouTube videos, since the current YouTube API does not provide information on users. This study was therefore unable to further observe the characteristics and behavior of users who interact with YouTube content, and the analysis and results were left at a descriptive level. To achieve a more in-depth look into the relations between users and YouTube’s algorithmic functions, surveys or even logs should be used, which is a future goal for researching user behavior on YouTube. Though this research focused on a single platform, similar studies can be done with Facebook, Instagram, Twitter, or other social media platforms if technical barriers to data collection are overcome in the future. Such cross-platform analysis can also help researchers compare how widespread coronavirus information is on these platforms and how it is being received by audiences in different countries across the world.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
YouTube Press accessed September 26th, 2020, via https://www.youtube.com/about/press/
YouTube Data Tools (Rieder, 2015) can be accessed via https://tools.digitalmethods.net/netvizz/youtube/
Only 50 seed videos were initially collected due to technical issues that arose in the data collection process. Crawl depth defines how far from the seeds the script should reach. When the crawl depth is set to 0, the script retrieves only the relations between the seeds. The highest crawl depth is 2.
The ‘ForceAtlas2’ algorithm enables nodes in the network to repulse each other like magnets, while edges attract their nodes, like springs. These forces create a movement that converges to a balanced state.
For comparison, a modularity algorithm was applied with the same resolution of 1.0 for all four countries. A resolution lower than 1.0 detects a larger number of smaller communities, while a higher resolution yields fewer, larger communities.
As Gephi does not yet provide a method to use the same color palette for different networks, colors for content type are different in the four networks and were therefore labeled in text in the figure.
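The effect of the resolution setting mentioned in the modularity note above can be illustrated with a small worked example. The sketch below scores two candidate partitions of a toy graph (two triangles joined by one edge) with a resolution-weighted modularity. It assumes that the resolution multiplies the within-community edge term, which matches the direction described in the note; the source does not confirm that this is the exact objective the software optimizes, and other tools may parameterize resolution differently. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of, resolution=1.0):
    """Resolution-weighted modularity of a partition of an undirected graph.

    Assumed form: sum over communities of
        resolution * (internal edges / m) - (community degree / 2m) ** 2
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        resolution * internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}  # two communities
merged = {node: "a" for node in range(6)}                  # one community

for resolution in (1.0, 5.0):
    print(
        resolution,
        modularity(edges, split, resolution),
        modularity(edges, merged, resolution),
    )
```

Under this assumed form, the two-community partition scores higher at a resolution of 1.0, while the single merged community scores higher at 5.0, which is the sense in which a higher resolution favors fewer, larger communities.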