TY - JOUR
T1 - How much information do search engines disclose on the links to a web page? A longitudinal case study of the 'cybermetrics' home page
AU - Bar-Ilan, Judit
PY - 2002
Y1 - 2002
AB - This study presents the results of an extensive search for links to the home page of the e-journal Cybermetrics. The results show that the search engines do not retrieve all the link pages that are indexed by them. In the specific case, the search engine Google concealed between 48 and 70% of the links to the page each time it was queried, and HotBot concealed between 20 and 39% of the link pages indexed by it. The queries were repeated four times during a one-year period, between January 2001 and January 2002 in order to rule out the possibility of an accidental finding. The other search engines examined also concealed some pages but to a much smaller extent. The findings raise questions about the use of WIF (the Web Impact Factor) as a scientometric indicator based on data retrieved from commercial search engines. The content of the retrieved and concealed pages was characterized using the method of content analysis. The characterization shows that the set of initially retrieved pages, and the set of initially retrieved pages plus the set of concealed pages, are significantly different for Google.
UR - http://www.scopus.com/inward/record.url?scp=0036944460&partnerID=8YFLogxK
U2 - 10.1177/016555150202800602
DO - 10.1177/016555150202800602
M3 - Article
AN - SCOPUS:0036944460
SN - 0165-5515
VL - 28
SP - 455
EP - 466
JO - Journal of Information Science
JF - Journal of Information Science
IS - 6
ER -