Returns an edge list of the links found on a given web page. The edge list can also include a column giving the origin of each link: E for an external page and I for an internal page.

Usage

edgelist_of(
  target,
  target_title = F,
  full_URL = T,
  links_title = F,
  origin = F
)

Arguments

target

The page whose links you want to extract (usually a web address).

target_title

Logical; defaults to FALSE. If TRUE, the edge list identifies the target by its page title instead of its URL.

full_URL

Logical; defaults to TRUE. If TRUE, the links are returned as full URL addresses.

links_title

Logical; defaults to FALSE. If TRUE, the edge list identifies every link by its page title. This can take a long time for large datasets.

origin

Logical; defaults to FALSE. If TRUE, the returned data frame includes the origin of each link (external/internal).
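
The internal/external distinction can be illustrated with a small base R sketch. This is not the package's implementation: url_host() and link_origin() are hypothetical helpers, and the sketch assumes that a link counts as internal when its host matches the host of the target. It ignores ports, user information, and relative links.

```r
# Hypothetical helper, not part of the package: extract the host from a URL.
url_host <- function(url) {
  # Drop the scheme (e.g. "https://"), then everything from the first "/", "?" or "#".
  host <- sub("^[A-Za-z][A-Za-z0-9+.-]*://", "", url)
  host <- sub("[/?#].*$", "", host)
  tolower(host)
}

# Hypothetical helper: label each link "I" (same host as the target)
# or "E" (different host). Assumes absolute URLs.
link_origin <- function(target, links) {
  ifelse(url_host(links) == url_host(target), "I", "E")
}

target <- "https://en.wikipedia.org/wiki/R_(programming_language)"
links <- c(
  "https://en.wikipedia.org/wiki/Statistics",
  "https://www.r-project.org/"
)
link_origin(target, links)
```

Matching on the full host treats subdomains as separate sites; whether the package does the same is not stated here.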

Value

A data frame containing the edge list of the links found on the targeted web page.

Examples


target <- "https://en.wikipedia.org/wiki/R_(programming_language)"
el <- edgelist_of(target)
#> Time for edgelist_of https://en.wikipedia.org/wiki/R_(programming_language): 0.352048635482788 seconds.
summary(el)
#>      FROM                TO           
#>  Length:1419        Length:1419       
#>  Class :character   Class :character  
#>  Mode  :character   Mode  :character
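
Because the result is an ordinary data frame with character columns FROM and TO, it can be processed with base R. The sketch below uses a small hypothetical edge list (the example.org and example.com URLs are placeholders, not real output) to show one way of dropping repeated edges and counting the distinct pages linked to.

```r
# Hypothetical edge list with the FROM/TO columns shown in the summary above.
el <- data.frame(
  FROM = rep("https://example.org/", 4),
  TO = c(
    "https://example.org/about",
    "https://example.org/about",
    "https://example.com/",
    "https://example.org/contact"
  ),
  stringsAsFactors = FALSE
)

# Drop repeated edges, then count how many distinct pages the target links to.
el_unique <- unique(el)
n_targets <- length(unique(el_unique$TO))
n_targets
```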