RePEc in April 2008

May 2nd, 2008 by Christian Zimmermann

As expected, RePEc beat traffic records in the past month, with both EconPapers and IDEAS posting records. Over all services, LogEc recorded 749,918 file downloads and 2,815,159 abstract views.

In other news, the RePEc Input Service was inaugurated a few days ago. We also experienced some email trouble as our email server was subject to a denial-of-service (DoS) attack. While the attack is still going on, we have found efficient ways to deal with it. Finally, the following institutions joined RePEc with new archives: Academy of Economic Studies (Bucharest), Academia Romana, University of Strathclyde, Indiana University-Purdue University Indianapolis, Wolters Kluwer Health, Robert Schumann Centre, Institute of Development Studies (Brighton), Lebanese Economic Association, Institute of Labour Law and Industrial Relations in the EC (IAAEG).

And now the various thresholds we passed last month, with plenty of important ones this time:
30,000,000 cumulated downloads
10,000,000 article downloads
700,000 monthly downloads
475,000 online items
175,000 paper abstracts
100,000 papers announced through NEP
16,000 registered authors
2,000 online chapters
1,600 software components

Contacting RePEc: update

April 28th, 2008 by Kit Baum

Following reconfiguration of RePEc’s email services, we have now reinstated the address repec at repec.org. Please use that address to contact the RePEc team.

RePEc Input Service

April 27th, 2008 by Christian Zimmermann

We inaugurate today the RePEc Input Service. As we detailed recently, bibliographic data is made available to RePEc on the publishers’ own ftp or web servers. Unfortunately, this is not possible in some cases, either because of web publishing policies or for technical reasons (in many cases forbidding the serving of files with an .rdf extension).

The RePEc Input Service is meant to be a service of last resort. Indeed, the principle of RePEc is that publishers are in charge of the maintenance of their own bibliographic metadata. This new service violates this in the sense that a RePEc volunteer has to maintain and host it. At this point, it can host only data about working paper series, but other document types are planned.

The scripts for the RePEc Input Service were written by Sune Karlsson. It is hosted by Christian Zimmermann at the University of Connecticut.

Volunteer recognition: Sune Karlsson

April 19th, 2008 by Christian Zimmermann

Sune Karlsson is currently Professor of Statistics at the Swedish Business School of Örebro University. He has been involved with RePEc, as a co-founder, right from the start and is an essential part of the RePEc team, providing a large number of services and great expertise.

While at the Stockholm School of Economics, Sune inaugurated S-WoPEc, the Swedish (now Scandinavian) Working Papers in Economics site, in 1997. S-WoPEc was one of the founding archives of RePEc in June 1997. In 1998, he created the working paper site of the European Business Schools Librarians’ Group.

In May 2001, Sune created LogEc, which compiles usage statistics for the various RePEc services and displays them. Two months later, he added EconPapers to his portfolio, now the second most popular service displaying the data collected by RePEc.

Sune also does a lot of behind-the-scenes work: a syntax checker for RePEc archive maintainers, which includes a URL checker also used for the NEP project (for which he also provided the first implementation script). He also runs the scripts that recognize the different versions of the same work. Finally, he is the editor of the NEP report on Econometrics.

Without Sune’s many initiatives and his master programming skills, RePEc would not be at the point it is today.

RePEc sponsors

April 13th, 2008 by Christian Zimmermann

Given that RePEc has no revenue, it relies on the goodwill of volunteers to run. But this work is not possible without support from some sponsors that are willing to share some resources for this good cause. Here is an attempt to acknowledge these sponsors.

Past sponsors included: Hitotsubashi University, Joint Information Systems Committee (JISC), University of Manchester Computing Centre (MIDAS), Open Society Institute, Université du Québec à Montréal, Stockholm School of Economics.

Cheating and RePEc

April 6th, 2008 by Christian Zimmermann

This post details how RePEc can be and has been useful in detecting cheating, and how RePEc is dealing with this unfortunate phenomenon.

Plagiarism by authors

RePEc facilitates the availability of research and thus makes it available to would-be plagiarists, but RePEc also facilitates the detection of such plagiarism, either directly through RePEc services like EconPapers and IDEAS, or indirectly as other services like Google or Yahoo use RePEc to populate their search engines. While it is not part of its mission, RePEc has on occasion assisted plagiarized authors in obtaining redress, resulting in at least one dismissal from graduate school among the several authors caught.

Plagiarism by publishers

Yes, publishers can also plagiarize, namely by publishing without authorization from authors. Quite obviously, RePEc is tailor-made for detecting such abuse. Unfortunately, there is little that RePEc can do to punish such publishers, except unlisting them. This has happened so far for one publisher, and another one is currently on probation.

Manipulating author profiles

Given that RePEc provides author rankings, there is an incentive to inflate one’s résumé with works of others. The logs of the RePEc Author Service are monitored on a regular basis. Any inappropriate claim is then flagged, and the misbehaving author may face a warning or even an exclusion, depending on circumstances. Honest mistakes may happen, but willful manipulation is not tolerated.

Manipulating statistics

Another way to improve rankings is to inflate download and abstract view statistics. Fortunately, LogEc applies a series of filters: among others, it removes multiple downloads from the same IP address clusters, looks for various suspicious download patterns, and includes a visual audit. Suspicious activity usually leads to a reset of the relevant statistics to zero with a warning, with exclusion being the consequence for a repeat offender. So far, one author has been excluded and several warned.
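To give a feel for what removing multiple downloads from the same IP address cluster could look like, here is a minimal sketch in Python. It is not LogEc's actual code: the definition of a cluster (an IPv4 /24 block), the log format, the function names and the second handle are all assumptions made for illustration.

```python
import ipaddress


def ip_cluster(address, prefix=24):
    """Map an IPv4 address to its enclosing block (assumed cluster: /24)."""
    return str(ipaddress.ip_network(f"{address}/{prefix}", strict=False))


def filtered_counts(log):
    """Count each item at most once per IP cluster.

    `log` is a hypothetical list of (item handle, IP address) pairs.
    """
    seen = set()
    counts = {}
    for item, address in log:
        key = (item, ip_cluster(address))
        if key in seen:
            continue  # repeat download from the same cluster: ignored
        seen.add(key)
        counts[item] = counts.get(item, 0) + 1
    return counts


# Illustrative log using documentation-only IP ranges.
LOG = [
    ("RePEc:uel:papers:0803", "192.0.2.17"),
    ("RePEc:uel:papers:0803", "192.0.2.200"),   # same /24 block as above
    ("RePEc:uel:papers:0803", "198.51.100.5"),
    ("RePEc:uel:papers:0802", "192.0.2.17"),
]
counts = filtered_counts(LOG)
```

A filter of this kind only handles the crudest form of inflation; the pattern detection and the visual audit mentioned above are not covered by this sketch.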

RePEc in March 2008

April 1st, 2008 by Christian Zimmermann

March is typically a month where all traffic records are beaten on RePEc. Well not this month, but we were close: 694,988 file downloads (less than 3,000 short of the record) and 2,675,511 abstract views (record). The fact that Easter fell in March this year probably has something to do with this. We can thus look forward to a glorious month of April! But we should not be too disappointed, as there is now a RePEc application on Facebook. Look for a big blue letter “R” in the applications menu of your Facebook account.

An uncharacteristically low number of new RePEc archives opened last month, four: Princeton University Press, the Department of Economics at University of Auckland, the Department of University of Malaga, and the Institute of Local Public Finance, Germany. As for the thresholds passed during the last month, we have:

6,000,000 cumulated downloads through EconPapers
3,000,000 references extracted
1,250,000 citations found
175,000 online working papers
140,000 items with references
100,000 cited articles
2,000 listed book chapters

How data is assembled in RePEc

March 30th, 2008 by Christian Zimmermann

RePEc is essentially a large bibliographic database. Thus it needs data about bibliographic items. As RePEc has no employees and can only rely on volunteers, it had to find a way to reduce the cost of data input to a minimum. It succeeded in the sense that this cost is shifted to those that benefit the most from having their publications listed on RePEc: the publishers. We call such publishers “RePEc archive maintainers.” They can be commercial publishers, university presses, economics departments, research centers, central banks, societies or other organizations that have some form of publication relevant to Economics.

This is how RePEc archive maintainers proceed: They maintain sets of flat text files following a particular format called ReDIF. There are different formats for different types of documents. For example, the template describing a working paper would look like this:

Template-Type: ReDIF-Paper 1.0
Author-Name: Hildegrund Muesli
Author-Workplace-Name: University of Upper Elbonia
Author-Name: Adalbrecht Vollkorn
Author-Workplace-Name: Institute for Grandiose Research
Title: The Economics of Gizmos: Grandiose Results
Abstract: Gizmos have become more common with the advent of cybermarkets. This paper explains how banking regulation, demographics and global climate change have increased the demand for gizmos.
Classification-JEL: Z00
Keywords: Location, Location, Location
File-URL: http://www.uelbonia.edu/econ/papers/0803.pdf
Number: 0803
Creation-Date: 2008-02
Handle: RePEc:uel:papers:0803

There are other templates for articles, chapters, books, software components, series and archives. For RePEc-internal uses, people and institutions also have templates, all with unique identifiers (handles) that allow for cross-linking. These templates are then placed on the website or the anonymous ftp site of the publisher, and RePEc services visit them on a regular basis, typically daily, to check for changes. This allows for very fast turn-around times.
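To illustrate how a service might read such a working paper template, here is a minimal sketch in Python. It is not the code used by any RePEc service and it does not implement the full ReDIF specification. It assumes that a template starts at a Template-Type line, that a field line has the form "Name: value" with a capitalized name, and that any other non-empty line continues the previous value. The function names are hypothetical.

```python
import re

# Assumed shape of a field line: capitalized name, colon, value.
FIELD = re.compile(r"^([A-Z][A-Za-z0-9-]*):\s*(.*)$")


def parse_redif(text):
    """Return a list of templates, each a list of (field, value) pairs.

    Pairs are kept in order because fields such as Author-Name repeat.
    """
    templates = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD.match(line)
        if match and match.group(1) == "Template-Type":
            templates.append([(match.group(1), match.group(2))])
        elif not templates:
            continue  # ignore anything before the first template
        elif match:
            templates[-1].append((match.group(1), match.group(2)))
        else:
            # Simplification: treat the line as a continuation.
            name, value = templates[-1][-1]
            templates[-1][-1] = (name, value + " " + line)
    return templates


def values(template, field):
    """All values of a possibly repeated field, in order."""
    return [value for name, value in template if name == field]


SAMPLE = """\
Template-Type: ReDIF-Paper 1.0
Author-Name: Hildegrund Muesli
Author-Workplace-Name: University of Upper Elbonia
Author-Name: Adalbrecht Vollkorn
Author-Workplace-Name: Institute for Grandiose Research
Title: The Economics of Gizmos: Grandiose Results
Number: 0803
Handle: RePEc:uel:papers:0803
"""

papers = parse_redif(SAMPLE)
authors = values(papers[0], "Author-Name")
```

The handle is what makes cross-linking possible: once every template carries a unique identifier, a person template can point to the works of that author simply by listing their handles.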

Complete instructions on how to proceed to open a RePEc archive can be found here. If your institution is not yet listed among the roughly 900 participating archives, consider following these instructions.

The Budapest Open Access Initiative

March 22nd, 2008 by Christian Zimmermann

The Budapest Open Access Initiative (BOAI) was signed on 14 February 2002. Its goal is to encourage an international effort to make research in all academic fields freely available on the internet. It defines open access as “the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds.” This definition is not limited to articles published in journals but also encompasses pre-prints (discussion or working papers as we call them in Economics).

The Directory of Open Access Journals (DOAJ) lists (as of the writing of this post) 3289 journals, including 68 in Economics, that satisfy the requirements of BOAI. While Economics is relatively underrepresented, the working paper culture in our field makes it possible to find in open access many, if not most, of the articles published in non-open access journals (RePEc tries very hard to identify links between different versions of the same work). In fact, most publishers explicitly allow authors to publish pre-prints or post-prints of their articles in institutional repositories, including working paper series. A good list of policies by publishers can be found at RoMEO.

If you think this is a good initiative, you can sign the BOAI here. Foremost, make sure your publications are available in free access through a working paper series, or absent this option, through the Munich Personal RePEc Archive. In particular, in most cases, authors should not remove their working papers once they are published in journals.

Classifying authors

March 16th, 2008 by Christian Zimmermann

A difficult task librarians often face in the classification of items is determining whether authors with similar names are the same person. Indeed, bibliographic records are most of the time very limited in author identification. Take the case of Adam Smith. He may be listed under his full name, which is by no means unique, or worse only as A. Smith, which is easily confused with others. Librarians then rely on context and additional information gathered outside of the bibliographic record to attribute the work to the right person, hopefully without error.

With the large number of works now available, such laborious categorization becomes unfeasible, and automatic classification makes numerous errors. Within RePEc, we rely on the authors themselves to perform the classification. When they register in the RePEc Author Service, they have the opportunity to enter all the possible name variations under which they may be listed in a bibliographic record. For John Maynard Keynes (who is not registered), such name variations could be:

John Maynard Keynes
John M. Keynes
John Keynes
J. M. Keynes
J. Keynes
Keynes, John Maynard
Keynes, John M.
Keynes, John
Keynes, J. M.
Keynes, J.

In addition, an author may have changed names (through marriage), be listed with a title (Prof., Sir) or with a suffix (Jr, Sr, III). Variations multiply if names have accents, which some publishers do not take into account or encode in the wrong character set. The possibilities are numerous. The registered author is then offered first suggestions of works that match the name variations and then suggestions that offer some close match to name variations (typographical errors happen). The author can then accept these works or reject them.
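The ten variations listed for Keynes follow a simple pattern, which the following Python sketch reproduces. It is only an illustration of the idea, not the code behind the RePEc Author Service: the function names are hypothetical, titles, suffixes and accents are ignored, and the matching is exact after normalizing case and spacing, so the close matches for typographical errors are not covered.

```python
def name_variations(first, middle, last):
    """Build the direct and inverted forms of a three-part name."""
    given_forms = [
        f"{first} {middle}",          # John Maynard
        f"{first} {middle[0]}.",      # John M.
        first,                        # John
        f"{first[0]}. {middle[0]}.",  # J. M.
        f"{first[0]}.",               # J.
    ]
    direct = [f"{given} {last}" for given in given_forms]
    inverted = [f"{last}, {given}" for given in given_forms]
    return direct + inverted


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences vanish."""
    return " ".join(name.lower().split())


def suggest(author_strings, variations):
    """Keep the author strings that exactly match one of the variations."""
    wanted = {normalize(variation) for variation in variations}
    return [name for name in author_strings if normalize(name) in wanted]


keynes = name_variations("John", "Maynard", "Keynes")
# Illustrative author strings as they might appear in bibliographic records.
hits = suggest(["KEYNES, J. M.", "J. Smith", "John  Keynes"], keynes)
```

Generating the list is the easy part. Deciding whether a record listed under "J. Keynes" really belongs to this author is exactly what the registered author is asked to confirm or reject.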

The RePEc Author Service has so far managed to collect data from close to 16,000 authors who have claimed over 300,000 works as theirs. Such data is in particular used to increase the accuracy of various rankings. And within this set of authors, there is already a large number of homonyms, even when one looks beyond the initial of the first name, which is the precision that some other services have.

If you know of other homonyms in the profession, encourage them to register!

The RePEc budget for 2008

March 9th, 2008 by Christian Zimmermann

          Budget 2007   Effective 2007   Budget 2008
Expenses  US$0.00       US$0.00          US$0.00
Revenues  US$0.00       US$0.00          US$0.00

Thanks to all our volunteers!

RePEc in February 2008

March 2nd, 2008 by Christian Zimmermann

Every month, a short summary of what happened with RePEc is sent to the RePEc-announce mailing list. I also put that message, slightly adapted, on this blog.

During this month, IDEAS moved to a new server sponsored by the Society for Economic Dynamics. It continues to be hosted by the University of Connecticut and is now located on a faster line to the Internet.

In terms of traffic, 613,984 file downloads and 2,246,241 abstract views were recorded within the month, once more significantly up from a year ago. This leads us to the thresholds we have passed this month:

40,000,000 cumulative article abstract views on all RePEc services
25,000,000 cumulative abstract views on EconPapers
300,000 items claimed by registered authors
100,000 papers with JEL codes
20,000 unique subscribers in NEP
2,800 journals and series

Volunteer recognition: Thomas Krichel

February 21st, 2008 by Christian Zimmermann

Thomas Krichel is not just a RePEc volunteer, he is RePEc. In 1991, as a research assistant at the Economics Department of Loughborough University, he saw the potential that the Internet gave for the dissemination of research in Economics, but could not manage to get hold of good data about new working papers. In February 1993, on a lectureship at the University of Surrey, he was luckier and teamed up with Féthy Mili, Economics librarian at the Université de Montréal, who contributed data on 250 series, and Hans Amman (University of Amsterdam), who let Thomas use his coryfee mailing list. Bob Parks soon joined with his Economics Working Paper Archive at Washington University. Thus the NetEc project was launched. It moved to a gopher server at the Manchester Computing Centre in 1993, and then to the web. That year, Thomas also got help in collecting data from José Manuel Barrueco Cruz, Economics librarian at the University of Valencia. But soon they realized that there was too much information out on the Internet for just the two of them to collect.

This is when Thomas suggested the creation of RePEc which would completely decentralize the data input: the publishers, who benefit the most from having their papers listed on web indexes, were to index the works themselves. With the collaboration of Sune Karlsson (SWoPEc, Stockholm School of Economics), Bob Parks and Corry Stuyts (DEGREE, Netherlands), José and Thomas then launched RePEc in June 1997. It still works under the same principles, with great success.

Thomas is still the heart and soul of RePEc. He has his hand in almost every project that is undertaken. After completing his Economics PhD at the University of Surrey, he moved to Long Island University to take a position of assistant professor in … Library Studies. Now tenured, he is an eminence grise in the online provision of bibliographic data and is pushing the RePEc concept into other fields. Within RePEc, most of his attention is currently directed towards NEP, the email notification service on new working papers.

World Ranking of Repositories, RePEc is #2

February 14th, 2008 by Christian Zimmermann

The Webometrics Ranking of World Universities is an initiative that tries to establish which universities provide the most content on the web and gain visibility from it. The ranking of universities is based on the size of the web domain (20%), the number of rich files available (PDF, RTF, etc., 15%), research on Google Scholar (15%), and link visibility (50%). Not surprisingly, US universities monopolize the first 24 spots, led by MIT.
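As a rough illustration of how four weighted criteria can be combined into one figure, here is a small Python sketch. Only the weights come from the description above. How Webometrics actually measures, normalizes and combines the indicators is not stated here, so the sketch simply assumes that each indicator has already been scaled to lie between 0 and 1. The names used are hypothetical.

```python
# Weights as quoted above; they add up to 100%.
WEIGHTS = {
    "size": 0.20,        # size of the web domain
    "rich_files": 0.15,  # PDF, RTF, etc.
    "scholar": 0.15,     # research on Google Scholar
    "visibility": 0.50,  # link visibility
}


def composite_score(indicators):
    """Weighted sum of indicators assumed to be scaled to [0, 1]."""
    return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS)


# A made-up site that is strong on visibility and weak elsewhere.
example = {"size": 0.2, "rich_files": 0.2, "scholar": 0.2, "visibility": 1.0}
score = composite_score(example)
```

With half of the weight on link visibility, a site that is heavily linked to can score well even if it hosts comparatively little material.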

Webometrics also ranks repositories, the criteria being the same as for universities. The ranking is led by Arxiv, the grand-daddy of all repositories covering much of Physics and Mathematics. RePEc is number 2, followed by E-LIS, a repository in Library Sciences founded by Thomas Krichel, who is also at the origin of RePEc!

Other notables down the list: HAL, a French repository that feeds to RePEc, at number 9; CDLIB, the University of California repository and a RePEc participant, at number 19; SSRN, not in RePEc, at number 37; the Munich Personal RePEc Archive, barely a year old, already at number 56; and AgEconSearch, not in RePEc, at number 126.

Society for Economic Dynamics sponsors new server for IDEAS

February 7th, 2008 by Christian Zimmermann

IDEAS just moved to a new server sponsored by the Society for Economic Dynamics. The old server, which was sponsored by the College of Liberal Arts and Sciences at the University of Connecticut, had been running almost flawlessly since October 2002, but was starting to get overwhelmed by the amount of material now in RePEc and by the heavy traffic and number crunching it entails. While the amount of material more than tripled, the complexity of the data increased much more than that, given the links with authors, references, citations, JEL codes, NEP reports, rankings, institutions, publication compilations, and reading lists.

The new server has more computational power, more memory and especially more disk space. As before, it is hosting IDEAS, EDIRC and QM&RBC. It also hosts the website of the Society for Economic Dynamics, which is willing to sponsor it as it was looking for space to host the datasets and program codes used for articles published in the Review of Economic Dynamics. The server is also set up to provide limited emergency support in case another RePEc service is failing. The hosting continues to be provided by the College of Liberal Arts and Sciences at the University of Connecticut. In particular, Tim Ruggieri from the CLAS Computer Support Group helped with the configuration of the server.

RePEc relies entirely on the support of volunteers in its operations. Contact us if you want to help in one way or the other.