Automatically Assessing the Quality of Wikipedia Articles

Since its inception in 2001, Wikipedia has quickly become one of the Internet’s dominant sources of information. Dubbed “the free encyclopedia”, Wikipedia contains millions of articles that are written, edited, and maintained by volunteers. Due in part to the open, collaborative process by which content is generated, many have questioned the reliability of these articles. The high variance in quality between articles is a potential source of confusion that likely leaves many visitors unable to distinguish good articles from bad ones. In this work, we describe how a very simple metric, word count, can be used as a proxy for article quality, and we discuss the implications of this result for Wikipedia in particular and for quality assessment in general.
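
To make the idea of a word-count proxy concrete, the following is a minimal illustrative sketch, not the procedure used in this work. The function names `word_count` and `looks_high_quality` and the `threshold` parameter are hypothetical; no particular cutoff is implied here, so the caller must supply one.

```python
def word_count(text: str) -> int:
    """Count whitespace-delimited tokens in the article text."""
    return len(text.split())


def looks_high_quality(text: str, threshold: int) -> bool:
    """Flag an article as likely high quality when its word count
    meets or exceeds a caller-supplied threshold (an assumption,
    not a value taken from this work)."""
    return word_count(text) >= threshold
```

In such a scheme, the threshold would be chosen empirically, for example by comparing the word counts of articles already judged to be good against those of the rest.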