Caselaw is Set Free, What Next?

Monday, October 20, 2014 | 2:41 PM

The next article in our 10th Anniversary Series is by Thomas Bruce. He is the director of the Legal Information Institute at Cornell. He co-founded the LII in 1992. Today, its legal collections are used widely and have inspired the Free Access to Law Movement, which has helped citizens worldwide learn about the laws that govern them. Thomas is also the author of Cello, the first Web browser for Microsoft Windows. -- Anurag Acharya


Caselaw is Set Free, What Next?

Thomas Bruce, Director, Legal Information Institute, Cornell

A lawyer story


Google Scholar’s caselaw collection is a victory for open access to legal information and the democratization of law. It would be more than worthy of celebration from that standpoint alone. But caselaw is above all an obsession of lawyers, and I’d like to start by telling the tale from their point of view.

Five years ago, when Google Scholar added judicial opinions to its portfolio, it created an immediate sensation among lawyers. Small-office and solo practitioners were the most vocal about it; they had always had a difficult time affording the services of commercial publishers, even in print. And now there was access to a significant chunk of material that had previously been lodged firmly behind paywalls. It was linked and searchable, and still better, it offered a version of the citation-tracking and evaluation features that lawyers knew and loved in expensive commercial systems. It had first-class sorting and filtering features. It had Bluebook-form citations for each case (pretty much the epitome of something that nobody but lawyers knows or cares about, but a very thoughtful touch indeed). Nobody in the open-access arena had tried such a thing, and probably only Google could have. One commentator said that, “Google fired (arguably) the loudest...salvo in the battle for free access to caselaw… and it apparently came as a tweet”.

Scholar’s immediate impact on the legal profession was owed in large part to its technical virtuosity. It was an unusual display of ingenuity used to democratize services and features whose value had mostly been known only to lawyers. But, for the legal profession, it was happening in the middle of a long-brewing, near-perfect storm. Since at least the early 90’s, clients had complained about surcharges that law firms added to legal research costs. By 2000, there was growing refusal to reimburse legal-research fees at all; clients felt that the firm’s online charges were just a part of overhead, like water and electricity. That was not an isolated gripe; rather, it was a visible crack in a business model that we now know had been eroding for quite some time. By one estimate, the 2008 implosion of the financial-services industry destroyed over a third of the legal employment in New York. A lot of firms changed radically or disappeared altogether in the aftermath. You could talk, in dry academic terms, about downward price pressure on the industry. One suspects that the feeling was more like riding in an elevator whose cables had been cut.

There had been free offerings of caselaw online for some time, starting with a BBS system offered by the Cleveland Freenet in 1989; the first web-based effort started here at Cornell in 1992, and was followed with a full edition of all Federal statutes in 1994. Elsewhere -- notably in Canada and Australia -- open-access systems offered by third parties had evolved into the de facto national standard. And government was catching up, with many law creators publishing their materials online, for free.

Free services had never been the first choice of lawyers in the US. Some of the reasons were rational -- free services often lacked features that lawyers depend on, most provided very little in the way of commentary or annotation, and in any case they were highly distributed. There was no “one-stop shopping” in the world of open access to law, just a lot of websites offering different collections. The irrational reasons were, if anything, even more interesting and far more influential, though much more deeply buried in lawyer psyches. Lawyers are notoriously conservative in their work methods, and many law librarians even more so. Anything that was both new and noncommercial was inherently suspect. And the commercial services had had more than a century to reinforce the idea that size and comprehensiveness were the only measures of quality that mattered.

Even so, it’s hard to convey the degree to which lawyers mistrust distributed systems. As John Lederer once remarked, “Lawyers don’t buy books -- they buy systems of books”, and so it was with electronic products as well. It was easy for lawyers to dismiss what they saw as isolated pockets of legal information offered by volunteers at wildly different levels of added value, and marketers of commercial services had been quick to emphasize these qualities. That said, in the year prior to the addition of caselaw to Scholar, Cornell’s website had delivered well over 81 million pageviews to nearly 14 million unique visitors. Of those pageviews, 4.5 million went to the Federal Rules of Civil Procedure, a collection unlikely to be used by anyone but lawyers.

Comes now Google, a company with unparalleled capacity and legendary technical skills, offering a large collection of caselaw under one roof, with a workable citator and advanced search functionality. That was a big story, and it was often reported as “Google takes on commercial legal-research behemoths”. It was free access offered from a source that could not be dismissed as somehow beneath notice or unlikely to survive. Google’s offerings in Scholar thus became a validation of, and a capstone on, the things that open-access advocates had been doing for years. Apart from its inherent value -- which was, and is, huge -- it was a sign that freely accessible legal information was technically advanced and more than sufficient for many if not most professional needs. Most of all, it signaled that free legal information was something to be taken seriously. It sent that signal at a time when circumstances compelled the profession to pay far more attention than it otherwise might have. Scholar not only brought us a new and capable collection, it brought a new level and quality of attention to the entire open-access enterprise.

Everyone else


I began by telling a story about law and lawyers, but of course there’s an even more compelling story about law and everyone else. Laws -- and particularly statutes and regulations -- affect everybody. They describe what’s possible and permissible, what it costs to do business, what we can expect from government and what government can expect from us. On any given day, an open-access legal web site such as ours, or Scholar, is used by people who are helping veterans get the benefits to which they’re entitled, small businesses planning new courses of action, and students at all levels who are learning about the Constitution and our system of government. There are law-enforcement personnel learning about the limits and obligations of their position, hospital managers consulting public-benefits law, and people finding out what they have to do to sell new products in new markets. Those people need access to law. They need to be able to create starting points for themselves, using search to connect words and phrases that they already understand with concepts and explanations that at first they will not understand at all. They need to be able to follow their noses from those poorly-understood things to other pages that will explain them. Making all that possible is the next challenge.

What now?


Google Scholar’s caselaw collection offers features -- such as citators -- that are a step toward the “system of books” that would fully integrate primary legal sources and commentary into a practical resource for public understanding and professional practice. The legal-information ecosystem on the Web as a whole is moving in that direction. As that progresses, the benefits to everyone affected by law -- which is to say, everyone, period -- will be enormous. We will move beyond making law available on the Web to making it truly accessible on the Web -- not just discoverable, but understandable.

In 1992, starting with important caselaw collections, the open-access community began connecting law to itself. Hyperlinks gave readers a way to seamlessly follow citations -- at least if the cited thing was available online somewhere. And simply seeing to it that the things that ought to be online are online kept us all busy for a very long time (and is still a significant problem, in many places, some of them surprisingly close to home). We need to increase the density of connections between documents by making connections easier for machines (rather than human authors) to create. We need to hugely increase the amount of freely-available material that explains the law. And we need to -- in ways both trivial, and not -- make it possible for people to find the laws that affect them using things they already know.

Regulations provide a really good arena for thinking about such problems, for two reasons. First, they are harder for information systems to deal with. They are inconsistently drafted by a wide variety of people. For example, the Code of Federal Regulations is essentially a compilation of the work of perhaps 200 agencies (nobody really knows exactly how many). And, compared to caselaw, regulations have been relatively neglected by open-access publishers. Most importantly, they are the largest single contact surface between the public and the legal system. Yes, there are Supreme Court cases that are sweeping in their effect on daily life -- roughly half a dozen a year, compared to the thousands and thousands of cases in the Federal system that are just about two people suing two other people over something that only four people care about (and maybe a fifth if you count the judge). Regulations affect lots of people, and they change often. That makes them much more of a challenge for open-access publishers, both technically and economically. It also makes it that much more urgent to provide citizens with improved modes of access and value-added services such as notification of changes and anything and everything that would make compliance easier. Second, regulations are about things, and they are often based on science. And building things that bridge knowledge domains is what information scientists do.

A trivial example may help. Right now, a full-text search for “tylenol” in the US Code of Federal Regulations will find… nothing. Mind you, Tylenol is regulated, but it’s regulated as “acetaminophen”. But if we link up the data here in Cornell’s CFR collection with data in the DrugBank pharmaceutical collection, we can automatically determine that the user needs to know about acetaminophen -- and we can do that with any name-brand drug in which acetaminophen is a component. By classifying regulations using the same system that science librarians use to organize papers in agriculture, we can determine which scientific papers may form the rationale for particular regulations, and link the regulations to the papers that explain the underlying science. These techniques, informed by emerging approaches in natural-language processing and the Semantic Web, hold great promise.
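The brand-name lookup described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the brand-to-ingredient table is hand-written rather than drawn from DrugBank, the three "regulations" are invented placeholder sentences rather than CFR text, and the function names are hypothetical. It is not a description of how Cornell's system is actually built.

```python
# Hypothetical brand-to-ingredient table. A real system would derive this
# from a pharmaceutical dataset rather than hard-coding it; the entries
# here are illustrative only.
BRAND_TO_INGREDIENTS = {
    "tylenol": ["acetaminophen"],
    "excedrin": ["acetaminophen", "aspirin", "caffeine"],
}

# Invented placeholder texts standing in for regulation sections.
# These are NOT quotations from the Code of Federal Regulations.
DOCUMENTS = {
    "example-1": "Labeling requirements for products containing acetaminophen.",
    "example-2": "Limits on caffeine in over-the-counter stimulant products.",
    "example-3": "Requirements for the grading of shell eggs.",
}


def full_text_search(term, documents=DOCUMENTS):
    """Plain substring search: finds only the literal term."""
    needle = term.lower()
    return sorted(doc_id for doc_id, text in documents.items()
                  if needle in text.lower())


def expand_query(term):
    """Return the term plus any ingredients known for that brand name."""
    key = term.lower()
    return [key] + BRAND_TO_INGREDIENTS.get(key, [])


def expanded_search(term, documents=DOCUMENTS):
    """Search for the term and for every ingredient it maps to."""
    needles = expand_query(term)
    return sorted(doc_id for doc_id, text in documents.items()
                  if any(needle in text.lower() for needle in needles))


print(full_text_search("Tylenol"))   # literal search
print(expanded_search("Tylenol"))    # search after brand-name expansion
```

With these toy inputs, the literal search for "Tylenol" returns nothing, while the expanded search reaches the placeholder document that mentions acetaminophen. The hard part in practice is building and maintaining the mapping, not applying it.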

All successful information-seeking processes permit the searcher to exchange something she already knows for something she wants to know. By using technology to vastly expand the number of things that can meaningfully and precisely be submitted for search, we can dramatically improve results for a wide swath of users. In our shop, we refer to this as the process of “getting from barking dog to nuisance”, an in-joke that centers around mapping a problem expressed in real-world terms to a legal concept. Making those mappings on a wide scale is a great challenge. If we had those mappings, we could answer a lot of everyday questions for a lot of people.

As I hinted earlier, search is often just the start; it shows the way to the trailhead, but the information-seeker must then follow a path that leads to commentary and deeper explanation of what the search engine offers easily. Building that path is a problem that rests critically on integration across multiple websites and collections. Metadata-publishing standards and linked-data approaches are helping; we look forward, for example, to a set of specific legal extensions to schema.org that will make it easier for people and machines to follow their noses from what search provides to the understanding that they really need. It will be a long job.

But that is a tale for another day, perhaps another ten years in the future. It’s exciting to see how far we’ve come. Scholar, and its legal collection, are a tremendous gift to those who want to know about the law, and a platform for those of us who want to go further.

Rise of the Rest: The Growing Impact of Non-Elite Journals

Wednesday, October 8, 2014 | 5:30 PM

The world of scholarly communication has changed quite a bit over the last decade and Scholar has been a part of the change. We are taking the opportunity of Scholar's 10th anniversary to explore the impact of these changes - looking at how scholarship and citation patterns have changed as publications and archives moved online and comprehensive relevance-ranked search became available to everyone.

As the next article in the 10th anniversary series, we have published a study on arXiv examining the evolution of the impact of non-elite journals. The idea that a small elite set of journals covers most of the key papers in a discipline has long been prevalent in the study of scholarly communication. We explore how this has changed over 1995-2013. - Anurag Acharya


Rise of the Rest: The Growing Impact of Non-Elite Journals


Anurag Acharya, Alex Verstak, Helder Suzuki, Sean Henderson,
Mikhail Iakhiaev, Cliff Chiung Yu Lin,  Namit Shetty

In this paper, we examine the evolution of the impact of non-elite journals. We attempt to answer two questions. First, what fraction of the top-cited articles are published in non-elite journals, and how has this changed over time? Second, what fraction of the total citations are to non-elite journals, and how has this changed over time?

To answer these questions, we studied citations to articles published in 1995-2013. We computed the 10 most-cited journals and the 1000 most-cited articles each year for all the 261 subject categories included in Scholar Metrics. We considered the 10 most-cited journals in a category as the elite journals for the category and all other journals in the category as non-elite.
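As a rough illustration of the two measurements, the sketch below computes them for one subject category in one year. It makes assumptions the text above does not confirm: each article is a hypothetical (journal, citation count) pair, and journals are ranked by the total citations to their articles in the sample. The function names and the toy data are invented for the example and do not come from the study.

```python
from collections import Counter


def elite_journals(articles, n_elite=10):
    """Return the n_elite journals with the most total citations.

    `articles` is a list of (journal, citations) pairs. Ranking journals
    by summed article citations is an assumption made for this sketch.
    """
    totals = Counter()
    for journal, citations in articles:
        totals[journal] += citations
    return {journal for journal, _ in totals.most_common(n_elite)}


def non_elite_share_of_top_articles(articles, n_elite=10, n_top=1000):
    """Fraction of the n_top most-cited articles published outside the elite set."""
    elite = elite_journals(articles, n_elite)
    top = sorted(articles, key=lambda a: a[1], reverse=True)[:n_top]
    return sum(1 for journal, _ in top if journal not in elite) / len(top)


def non_elite_share_of_citations(articles, n_elite=10):
    """Fraction of all citations that go to articles outside the elite set."""
    elite = elite_journals(articles, n_elite)
    total = sum(citations for _, citations in articles)
    outside = sum(c for journal, c in articles if journal not in elite)
    return outside / total


# Toy data: six articles in four journals (assumes a non-empty sample).
toy = [("A", 50), ("A", 40), ("B", 30), ("C", 25), ("B", 5), ("D", 10)]

print(elite_journals(toy, n_elite=2))
print(non_elite_share_of_top_articles(toy, n_elite=2, n_top=4))
print(non_elite_share_of_citations(toy, n_elite=2))
```

In the toy data, journals A and B form the elite set, one of the four most-cited articles appears elsewhere, and 35 of the 160 citations go to non-elite journals. Repeating the computation for each year would give the trend the study reports.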

There are two main conclusions from our study. First, the fraction of highly-cited articles published in non-elite journals increased steadily over 1995-2013. While the elite journals still publish a substantial fraction of high-impact articles, many more authors of well-regarded papers in a diverse array of research fields are choosing other venues.

Our analysis indicates that the number of top-1000 papers published in non-elite journals for the representative subject category went from 149 in 1995 to 245 in 2013, a growth of 64%. Looking at broad research areas, 4 out of 9 broad areas saw at least one-third of the top-cited articles published in non-elite journals in 2013. All broad areas of research saw a growth in the fraction of top-cited articles published in non-elite journals over 1995-2013. For 6 out of 9 broad areas, the fraction of top-cited papers published in non-elite journals for the representative subject category grew by 45% or more.

Second, now that finding and reading relevant articles in non-elite journals is about as easy as finding and reading articles in elite journals, researchers are increasingly building on and citing work published everywhere. Considering citations to all articles, the percentage of citations to articles in non-elite journals went from 27% of all citations in 1995 to 47% in 2013. Six out of nine broad areas had at least 50% of total citations going to articles published in non-elite journals in 2013.

Read on arXiv

10th Anniversary Series: SciELO, Google Scholar and Latin American Journals

Monday, September 29, 2014 | 12:24 PM

The second article in our 10th Anniversary Series is by Abel Packer. He is the director of the SciELO Program, which has transformed scholarly publishing in Latin America. Given SciELO's multi-lingual reach, this post appears in English, Portuguese and Spanish. - Anurag Acharya.


SciELO, Google Scholar and Latin American Journals

Abel L. Packer
SciELO/FAPESP Program, Director

SciELO is 16 years old. Today, it publishes approximately one thousand selected peer-reviewed open access journals grouped into national collections. The SciELO Network currently comprises 16 national collections, 13 from Latin America and three from Portugal, Spain and South Africa.

The primary goal of SciELO is to provide growing visibility to the research published by national journals. When SciELO was launched these journals were print-only, usually with a small subscriber base. Only a few journals were indexed in citation indexes and there was no way of determining the real or potential impact that most of the journals had in their respective fields.

Today, we estimate about one million downloads a day across the network, 500 thousand of them from SciELO Brazil, based on COUNTER-compliant statistics. The total number of articles hosted across the SciELO Network is over 450 thousand.

Two important questions: how did SciELO succeed in setting up such a broad-based operation and achieve such an impressive performance in terms of downloads and why have so many countries and journals joined the SciELO Network?

There are four major factors. First, the reputation and leadership of the driving organizations. The SciELO project was established and nurtured by the São Paulo Research Foundation (FAPESP), widely known in Brazil as the most efficient and advanced research agency in the country, and the Latin American and Caribbean Center in Health Sciences Information (BIREME), which is affiliated with the Pan American Health Organization and the WHO. The initial motivation for the partnership was to develop a citation index covering a more comprehensive collection of journals beyond the 17 which were then indexed in the Journal Citation Reports from ISI. Soon after launch, the Chilean National Commission for Scientific and Technological Research (CONICYT) joined the effort. From 2002 on, the Brazilian National Council for Scientific and Technological Development (CNPq) and other national research agencies also started to support SciELO.

Second, the selective acceptance criteria applied to journals for SciELO collections. Only open access peer-reviewed journals with an editorial board composed of well-known researchers, a reasonable rejection rate of manuscripts and standards-compliant publication processes were accepted. The best journals of Brazil were invited by FAPESP to join SciELO. CONICYT took a similar approach for SciELO Chile. This helped set the expectation of selective acceptance criteria for new national collections.

Third, the tremendous impact of Google Scholar, which was decisive in moving the program ahead. As soon as Google Scholar began indexing SciELO, the traffic to SciELO sites increased to such an extraordinary extent that the general panorama was completely changed. The dramatic growth contributed, in a major way, to overcoming the resistance that publishers had towards online publication. Google Scholar showed publishers, editors, authors and users that online publication was the new paradigm for the dissemination of journals and that SciELO could help them to achieve it. The processes put in place by SciELO to create structured versions of the articles and metadata, as well as the standardization of article formatting, were key components in the rapid success of the indexing effort.

Fourth, the success and the increasing use of SciELO, together with quality control on the journals, led national research evaluation systems to include SciELO as an index in their evaluation criteria. This favored an increase in manuscript submissions to indexed journals, which provided an additional impetus to the program.

The other ongoing objective of SciELO is to increase the impact of the research communicated by its journals. A key requirement for this is to identify and count citations to SciELO journals and articles. SciELO computes bibliometric indicators covering the journals it hosts. To measure broader impact, SciELO initially relied on Web of Science (WoS) and Scopus. However, these indexes have incomplete coverage of SciELO journals. For example, in 2014 Scopus covers 70% of the journals in SciELO Brazil, and WoS only 36%. To partially close this gap in coverage, SciELO concluded an agreement with Thomson Reuters to include, as of 2014, the SciELO Citation Index in the WoS platform, which provides wider coverage, particularly in the physical and life sciences.

However, Google Scholar has much broader coverage worldwide, even more so in social sciences and humanities. As a result, Scholar Metrics offers more comprehensive citation numbers. These are now used by SciELO to evaluate the broader influence of its journals. Scholar Metrics are also a key part of the evaluation process for new journals that want to become part of SciELO. In this regard, what we would really like in Scholar Metrics is the availability of an annual series of indicators, and extending the journal rankings beyond 100.

SciELO and Google Scholar have come a long way together. Together, we have helped to significantly increase the worldwide visibility of Latin American journals and journals of Portugal, Spain and South Africa. On its anniversary, we would like to congratulate the Scholar team for the impressive development of Google Scholar, a comprehensive search service that many gifted minds once only dreamed of. Long live Google Scholar!

10th Anniversary: SciELO, Google Scholar and Latin American Journals (from the Portuguese edition)


SciELO, Google Scholar and Latin American Journals

Abel L. Packer
SciELO/FAPESP Program, Director

SciELO is 16 years old. Today, it publishes approximately one thousand selected, peer-reviewed, open access journals grouped into national collections. The SciELO Network currently comprises 16 national collections: 13 from Latin America, plus Portugal, Spain and South Africa.

The primary role of SciELO is to provide growing and sustainable visibility to the research communicated by nationally published journals. When SciELO was launched, these journals existed only in print, usually with a small subscriber base. Few journals were indexed in citation indexes, and there was no way of determining the real or potential impact that many of the journals had in their respective fields.

Today, we estimate approximately one million downloads a day across the network, 500 thousand of them from SciELO Brazil, according to statistics compatible with the COUNTER initiative. The total number of articles hosted across the SciELO Network exceeds 450 thousand.

How did SciELO succeed in creating such a broad operation and achieve such impressive performance in terms of downloads? Why have so many countries and journals joined the SciELO Network?

There are four main factors. First, the reputation and leadership of the driving organizations. The SciELO project was established and developed by the São Paulo Research Foundation (FAPESP), widely recognized in Brazil as the most efficient and advanced research funding agency in the country, and the Latin American and Caribbean Center in Health Sciences Information (BIREME), which is affiliated with the Pan American Health Organization and the WHO. The initial motivation for the partnership was to develop a citation index covering a more comprehensive collection of journals beyond the 17 that were indexed in the Journal Citation Reports from ISI. Soon after launch, the Chilean National Commission for Scientific and Technological Research (CONICYT) joined the initiative. From 2002 on, the National Council for Scientific and Technological Development (CNPq) and other national research agencies also began to support SciELO.

Second, the selective acceptance criteria applied to journals in the SciELO collections. Only open access, peer-reviewed journals with an editorial board made up of renowned researchers, a reasonable manuscript rejection rate and a standards-compliant publication process were accepted. The best journals in Brazil were invited by FAPESP to join SciELO. CONICYT took a similar approach for SciELO Chile. This helped set the expectation of selective acceptance criteria for new national collections.

Third, the tremendous impact of Google Scholar, which was decisive in moving the program forward. As soon as Google Scholar began indexing SciELO, traffic to the SciELO sites increased extraordinarily, which changed everything. The increase contributed significantly to overcoming the resistance that publishers had toward online publication. Google Scholar showed publishers, editors, authors and users that online publication was the new paradigm for the dissemination of journals, and that SciELO could help them achieve it. The process established by SciELO to create structured versions of the articles and metadata, as well as the standardization of article formatting, were key components in the rapid success of the indexing effort.

Fourth, the success and growing use of SciELO, combined with quality control of the journals, led national research evaluation systems to include SciELO as an index in their evaluation criteria. This favored an increase in manuscript submissions to indexed journals, which gave the program an additional boost.

The other ongoing objective of SciELO is to increase the impact of the research communicated by its journals. A key condition for this is to identify and count citations to SciELO journals and articles. SciELO computes bibliometric indicators covering the journals it hosts. To measure impact more broadly, SciELO initially relied on Web of Science (WoS) and Scopus. However, these indexes have incomplete coverage of SciELO journals. For example, in 2014 Scopus covers 70% of the journals in SciELO Brazil, and WoS only 36%. To partially address this lack of coverage, SciELO concluded an agreement with Thomson Reuters to include, as of 2014, the SciELO Citation Index in the WoS platform, which provides wider coverage, mainly in the physical and life sciences.

However, Google Scholar has greater worldwide coverage, especially in the social sciences and humanities. As a result, Scholar Metrics offers more comprehensive citation numbers. These are now being used by SciELO to evaluate the broader influence of its journals. Scholar Metrics is also a fundamental part of the evaluation process for new journals that wish to join SciELO. In this regard, what we would really like to have in Scholar Metrics is the availability of annual series of indicators, and the extension of the journal rankings beyond 100.

SciELO and Google Scholar have come a long way together. Together, we have helped to increase the worldwide visibility of journals from Latin America, as well as from Portugal, Spain and South Africa. On its anniversary, we would like to congratulate the Scholar team on the remarkable development of Google Scholar, a comprehensive search service that many brilliant minds once only dreamed of. Long live Google Scholar!

10th Anniversary: SciELO, Google Scholar and Latin American Journals (from the Spanish edition)


SciELO, Google Scholar and Latin American Journals

Abel L. Packer
SciELO/FAPESP Program, Director

SciELO is 16 years old. Today, it publishes approximately one thousand selected, peer-reviewed, open access journals grouped into national collections. The SciELO Network currently consists of 16 national collections: 13 from Latin America, as well as Portugal, Spain and South Africa.

The main goal of SciELO is to provide visibility and sustainable growth to the research communicated by nationally published journals. When SciELO was launched, these journals were print-only, with a small subscriber base. Only a few journals were indexed in citation indexes, and there was no way of determining the real or potential impact that most of the journals had in their respective fields.

Today, we estimate about one million article downloads a day across the network, 500 thousand of them (based on COUNTER-compliant statistics) from SciELO Brazil. The total number of articles hosted on the SciELO Network exceeds 450 thousand.

How did SciELO succeed in establishing such a broad operation and achieve such impressive performance in terms of downloads? Why have so many countries and journals joined the SciELO Network?

Hay cuatro factores principales. En primer lugar, la reputación y liderazgo de las organizaciones de conducción. El proyecto SciELO fue fundado y mantenido por la Fundación de Investigación de San Pablo (FAPESP), ampliamente conocida en Brasil como la agencia de investigación más eficaz y avanzada en el país, y el Centro Latinoamericano y del Caribe en Información en Ciencias de la Salud (BIREME), que está afiliada a la Organización Panamericana de la Salud y la OMS. La motivación inicial de la asociación fue desarrollar un índice de citaciones que abarcara una colección más completa que las 17 revistas que entonces eran indexadas en el Journal of Citation Reports del ISI. Poco después del lanzamiento, la Comisión Nacional chilena para la Investigación en Ciencia y Tecnología (CONICYT) se unió al esfuerzo. A partir de 2002, el Consejo Nacional de Desarrollo Científico y Tecnológico de Brasil (CNPq) y otras agencias nacionales de investigación también empezaron a apoyar a SciELO.

En segundo lugar, los criterios selectivos de aceptación aplicados a las revistas de las colecciones SciELO. Solamente fueron aceptadas revistas arbitradas de acceso abierto con un consejo editorial integrado por reconocidos investigadores, una tasa de rechazo razonable de manuscritos y estándares compatibles con los procesos de publicación. Las mejores revistas de Brasil fueron invitadas por la FAPESP a unirse a SciELO. CONICYT tomó un enfoque similar para SciELO Chile. Esto ayudó a establecer la expectativa de los criterios de aceptación selectivos para las nuevas colecciones nacionales.

En tercer lugar, el tremendo impacto de Google Scholar que fue decisivo en llevar adelante el programa. Tan pronto como Google Scholar comenzó a indexar SciELO el tráfico a los sitios SciELO aumentó en una cantidad tan extraordinaria que lo cambió todo. Este crecimiento contribuyó, en gran medida, a la superación de la resistencia que los editores tenían hacia la publicación en línea. Google Scholar mostró a las editoriales, editores, autores y usuarios, que la publicación en línea era el nuevo paradigma de la diseminación de revistas y que SciELO podría ayudar a lograrlo. Los procesos puestos en marcha por SciELO para crear versiones estructuradas de los artículos y metadatos, así como la estandarización del formato de artículo fueron componentes claves en el rápido éxito del esfuerzo de indexación.

Fourth, the success and growing use of SciELO, together with quality control of the journals, led national research evaluation systems to include SciELO as an index in their evaluation criteria. This encouraged an increase in manuscript submissions to the indexed journals, which gave the program additional momentum.

SciELO's other ongoing goal is to increase the impact of the research communicated by its journals. A key requirement for this is identifying and counting citations to SciELO journals and articles. SciELO computes bibliometric indicators covering the journals it hosts. To measure broader impact, SciELO initially relied on the Web of Science (WoS) and Scopus. However, these indexes have incomplete coverage of SciELO journals. For example, in 2014 Scopus covers 70% of SciELO Brazil journals, and WoS only 36%. To partially address this gap in coverage, SciELO established an agreement with Thomson Reuters to include, starting in 2014, the SciELO Citation Index on the WoS platform, which provides broader coverage, especially in the physical and biological sciences.

However, Google Scholar has broader coverage worldwide, even more so in the social sciences and humanities. As a result, Scholar Metrics offers more complete citation counts. SciELO now uses these to evaluate the broader influence of its journals. Scholar Metrics is also a key part of the evaluation process for new journals that wish to join SciELO. In this regard, we would very much like Scholar Metrics to provide annual series of indicators and to extend beyond position 100.

SciELO and Google Scholar have come a long way together. Together, we have helped to significantly increase the worldwide visibility of journals from Latin America as well as Portugal, Spain, and South Africa. On its anniversary, we would like to congratulate the Scholar team on the impressive development of Google Scholar, a comprehensive search service that many gifted minds once only dreamed of. Long live Google Scholar!

A Rapid Round of UI Changes

Tuesday, September 23, 2014 | 7:16 PM

Every now and then, we hear someone say, “Scholar never changes!” We, of course, know otherwise. But we do understand how it can be hard to notice gradual changes in someone you spend a lot of time with. To give you a bit of a peek behind the curtain, here are some of the changes that we have rolled out recently.

Author profiles

The citation histogram on the profile page can now be expanded to show a larger and more detailed version when you click on it.  It's also available on phones and tablets - click on the "Citation indices" header to show the graph on smaller screens.  The citation histogram on each article's page can now be scrolled to display the entire range of years.

The "Follow" button on a public profile is now gray when you're already following the author.  It's still an inviting blue if you're not following them.

By popular demand, we've reinstated the "Articles 1-20" marker on the author profile pages, to help navigate a long list of publications.  This is automatically updated as you expand the list of articles or page back and forth.

Finally, we made it easier to close obsolete accounts.  You can now delete your Scholar account, without closing the entire Google account.  This is useful if you have accidentally created multiple author profiles.  You can delete your Scholar account from the "Account" tab on the Scholar settings page.

Search results

The links are now a darker blue, and the visited links a darker purple, to help readability.  We also removed the underlines, for a cleaner and more consistent look.

The "Cite" dialog is now a bit crisper - we tightened the text and removed the redundant options to save to your library and to change the export format setting. The "Save" option still appears next to the "Cite" option under every search result, and the export format can still be selected on the settings page.

On the court opinions side, you can now share and bookmark links to specific pages.  Click on the page number in the margins or in the text, and copy or bookmark the URL with "#p123" at the end.
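If you build such links in a script rather than by clicking, the same result can be had by setting the URL fragment. The following is only a minimal sketch: the helper name `link_to_page` and the example URL are hypothetical, and the only thing taken from the feature itself is the "#p123" suffix described above.

```python
from urllib.parse import urlsplit, urlunsplit


def link_to_page(opinion_url: str, page: int) -> str:
    """Return opinion_url with its fragment set to 'p<page>' (e.g. '#p123').

    Any fragment already present on the URL is replaced.
    """
    if page < 1:
        raise ValueError("page must be a positive integer")
    parts = urlsplit(opinion_url)
    return urlunsplit(parts._replace(fragment=f"p{page}"))


# Hypothetical opinion URL, used only for illustration.
example = "https://example.org/opinion?case=42"
print(link_to_page(example, 123))
```

Because the page marker lives in the fragment, the rest of the URL is left untouched.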

Posted by: Alex Verstak, Senior Staff Engineer

10th Anniversary Series: Helping Researchers See Farther Faster

Monday, September 8, 2014 | 5:00 PM

Google Scholar will soon be 10 years old. It is amazing how time flies. It seems like just yesterday that Alex and I were scrambling to put everything in place for the launch. To help celebrate the anniversary, we have invited friends and colleagues in scholarly communication to share their thoughts about Scholar, about scholarly communication, and about future directions. These will appear in a 10th anniversary blog post series. The first post in the series is by John Sack, the founding director of HighWire Press. - Anurag Acharya


Helping Researchers See Farther Faster


John Sack, Founding Director, HighWire Press

HighWire Press started at Stanford University almost 20 years ago -- we launched the Journal of Biological Chemistry Online in early 1995 -- about the same time that Google's founders were working in the same Stanford Quadrangle on the foundations for Google.  It took until 2002 to get our two efforts together and index HighWire-hosted scholarly articles in Google.  This project increased usage of the articles by one to two orders of magnitude, even though their abstracts had been fully indexed in PubMed right from the start. Two years later, in 2004, Google Scholar arrived.

In the twenty years since HighWire began, and in the ten years since Google Scholar beat a path to the door of scholarship, what have we achieved?   We know the answer to that question from interviews we did in 2002 and again in 2012-2014 with over sixty researchers.

Back in 2002, people still used the word "e-Journal" to describe the electronic version of a "print journal". Researchers told us they needed better ways to locate content across all the different sources of full-text – publisher sites each had their own separate search engines, and PubMed searched only abstracts.

We collectively solved that problem -- publishers took a big leap in providing the Google indexer with access to subscriber-only content.  So when HighWire asked Stanford researchers in 2012 interviews about the challenges of searching, they said:

   "Finding is easy...
   ...but reading is hard."

We had solved the search problem so well that people found more than they could handle. This wasn't just a relevance-ranking problem -- useless stuff showing up in search results. There was important material in those results and it needed to be evaluated to satisfy a researcher's sense of thoroughness.

Reading Faster


To “read” many articles in a short period of time, researchers want to be able to absorb the gist of an article quickly, and be able to judge its quality and relevance.  In our interviews with researchers, we heard strong support for adding visual abstracts to articles (as the American Chemical Society has been doing for years in all of its journals); for adding "take home messages" to articles indicating the significance of an article in the context of what is known and what the article adds (often found in clinical journals, like the BMJ, but now also appearing in basic-science journals such as PNAS and the JBC); and for a contextualized 'figure reading' experience (such as is found in the Lens viewer introduced in eLife).

All of these help researchers take in an article faster. None of these aids is available from Scholar search results, so readers must visit the sites where the full text is found. This “pogo-sticking” from search result to article and back may seem normal and natural to us in the publishing industry. But as consumers we rely on Google showing augmented search results: if Google stopped showing movie and restaurant “star” ratings and restaurant price ranges (“$$$”) in its search results, we’d think there was a bug!

How can Google Scholar meet this "read faster" challenge? How search evolves on this front will affect how researchers and publishers do their work of finding audiences.

Contextualization of References


One way to speed scholarly literature research would be to improve the “directedness” of search results -- don't just give me a list of articles, but give me or get me to paragraphs in context. Clearly Scholar knows the context for matching a query's criteria since it shows a snippet from the text.  Why not have Scholar and publisher sites collaborate a bit more to help readers get quickly from a result list to the first paragraph that matches a search, then on to the next matching paragraph, and so on?

And if Scholar can do that with search results, perhaps it can also help us with the too-arduous task of going from a citation embedded in an article, to the specific part of the cited article that is being referenced. Book references contain page numbers; why should journal articles be less specific?

Perhaps we can see how unhelpful this is by stepping out of our scholarly-publishing tradition and shifting to the consumer context: Imagine if a Google search provided you with a link to only the web site (i.e., home page) rather than to the specific page on a site that matched your search!  That's what we settle for with scholarly journal references.

Searching For Images


We know from researcher interviews that in some fields people don't start by reading the article text per se; they "read" the images and then look at the narrative around the images for context.  In some fields, figures tell the story -- just as in graphic novels and comic books, I suppose! -- and an article is figures woven together by text.  This isn’t true only of disciplines that are visual in the traditional sense; it is perhaps just as true of equations in a physics article, structures in a chemistry article, or tables in a clinical-trial article.

So why not make it possible to search images by searching the figure legend, the text in a figure or table, or the closed captions in a video? Google already provides a basic image search. Perhaps if publishers would provide Scholar with rights to display low-resolution article images – the visual equivalent of a snippet – we could have a scholarly version of image search.

There are great opportunities for innovation ahead of us. We will need to take some risks, build experiments and collaborate across boundaries between stakeholders. That’s what we have done for the past decade, and look how far we have come -- “finding is easy”!