Usona Esperantisto

Bimonthly bulletin of Esperanto-USA

ESPDIC: The Esperanto-English Dictionary Project

Paul Denisowski

ESPDIC is an online Esperanto-English dictionary project that documents Esperanto from a descriptive (rather than prescriptive) point of view. As such, it contains many alternate and rarely used terms not found in more traditional dictionaries. Readers of ESPDIC can help advance the project by contributing corrections and additions.

To explain ESPDIC’s history and goals, we’ve invited compiler Paul Denisowski to present an overview of the project. You can also visit the ESPDIC home page to search the online dictionary and learn more about ESPDIC.

Background

Dictionaries have always been one of the most important tools in language study, both for the beginner and the advanced learner. This is especially true for students of Esperanto, since opportunities to learn vocabulary and usage from other speakers or from print/media are much more limited than in the case of national languages.

Fortunately, a number of good Esperanto-English dictionaries have been available for some time. Butler’s Esperanto-English Dictionary (1967) was one of the more comprehensive dictionaries available for many decades, and Wells’ Teach Yourself Esperanto Dictionary (1969) was recently updated (2010) and is available in both softcover and hardcover.

Paper dictionaries do, however, suffer some drawbacks, primarily in terms of portability and speed. Carrying a paper dictionary (much less multiple dictionaries) in one’s pockets or bag is not always practical, and manually searching for a given term can be time-consuming. Electronic dictionaries, which do not suffer from these limitations, have existed since the beginning of the computer age, but in the past these were often too small to be useful – it is, after all, the less common terms that one tends to look for in a dictionary.

Electronic Dictionary Projects

Fortunately, the growth of the Internet has enabled a number of successful electronic dictionary projects. Possibly the most successful of these has been Jim Breen’s EDICT Japanese-English dictionary. Started in 1991, EDICT now contains over 160,000 entries, most of them contributed by the user community, and has been integrated into a large number of web pages and applications.

As a contributor to EDICT, I became interested in starting a similar project for Chinese and launched the CEDICT Chinese-English dictionary in 1997. The original core of this dictionary was made up of vocabulary lists from various Chinese textbooks (about 1,200 entries total), to which were added words and expressions I came across while reading. When the dictionary reached approximately 20,000 entries, I began making it available on my website and asked for user contributions. The dictionary has now grown to over 100,000 entries.

Launch of ESPDIC

Since my early days as an Esperantist in the mid-1980s, I had been keeping (in paper notebooks) lists of words and phrases I came across in my studies and reading. Several years ago I began converting these to electronic format as well as compiling vocabulary lists from Esperanto textbooks. I then began to supplement this proto-dictionary with word lists found on the Internet and released the first version of ESPDIC in May 2011. Since then the dictionary has grown rapidly, and as of July 2013 ESPDIC has over 60,000 entries.

There are several criteria I use for adding entries to ESPDIC. Although the greatest strength of Esperanto is the ease of deriving words and coining expressions based on roots and affixes, the most important rule is that I never add my own creations to ESPDIC – all entries must come from other (primarily printed) sources in the Esperanto community. The corollary is that (generally speaking) the dictionary is descriptive, not prescriptive: it simply captures the words and expressions that are being used in Esperantujo rather than trying to suggest or dictate “proper” usage. Another goal of ESPDIC is to include rare, unusual, or technical words, as these are unlikely to be found in most other Esperanto dictionaries.

The latest version of ESPDIC (as a single text file) can be found on the ESPDIC home page. The home page also contains additional information about the project as well as a simple interface for searching ESPDIC online.

Role of technology in ESPDIC

Although the Internet has been the greatest technological enabler of ESPDIC, custom software tools have also played a critical role in the development and maintenance of the dictionary. In the early phases of the dictionary I developed Perl scripts for the more mundane tasks of resolving duplicate entries, sorting the dictionary (in Esperanto alphabetical order), and detecting common formatting errors. Current tool development is more focused on leveraging various external resources, such as checking ESPDIC entries against search engines and flagging for review those entries with very low hit counts. Additional tools will provide checking against other dictionary resources (e.g. finding words in the PIV but not defined in ESPDIC) and parsing of texts (such as those from Project Gutenberg) in order to flag undefined words and expressions. Despite the substantial benefits these tools provide, they cannot replace human proofreading and editing – more on this below.
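To make the kind of tooling described above more concrete, here is a minimal sketch in Python (not the Perl scripts actually used for ESPDIC). It illustrates three of the tasks mentioned: removing duplicate entries, sorting in Esperanto alphabetical order, and flagging words in a text that have no dictionary entry. The "headword : definition" line format and all function names are assumptions made for illustration only.

```python
import re
import unicodedata

# The 28 letters of the Esperanto alphabet, in dictionary order.
ESPERANTO_ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
_RANK = {letter: i for i, letter in enumerate(ESPERANTO_ALPHABET)}


def esperanto_sort_key(word):
    """Build a sort key in which ĉ follows c, ĝ follows g, and so on."""
    normalized = unicodedata.normalize("NFC", word).lower()
    # Anything outside the alphabet (spaces, hyphens, q, w, x, y) sorts last;
    # the normalized word itself is appended as a tiebreaker.
    ranks = [_RANK.get(ch, len(ESPERANTO_ALPHABET)) for ch in normalized]
    return (ranks, normalized)


def parse_entry(line):
    """Split one line of the assumed 'headword : definition' format."""
    headword, _, definition = line.partition(" : ")
    return headword.strip(), definition.strip()


def merge_and_sort(lines):
    """Drop exact duplicate entries and return the rest in Esperanto order."""
    entries = {parse_entry(line) for line in lines if line.strip()}
    return sorted(entries, key=lambda e: (esperanto_sort_key(e[0]), e[1]))


def undefined_words(text, headwords):
    """List words in a text that are not headwords.

    Deliberately naive: no handling of grammatical endings or compounds.
    """
    known = {unicodedata.normalize("NFC", h).lower() for h in headwords}
    tokens = re.findall(r"[a-zĉĝĥĵŝŭ]+", unicodedata.normalize("NFC", text).lower())
    return sorted(set(tokens) - known, key=esperanto_sort_key)


if __name__ == "__main__":
    sample = [
        "ĉevalo : horse",
        "cigno : swan",
        "dato : date",
        "cigno : swan",  # exact duplicate, removed by merge_and_sort
        "ĉambro : room",
    ]
    for headword, definition in merge_and_sort(sample):
        print(f"{headword} : {definition}")
    print(undefined_words("La ĉevalo vidis cignon", ["la", "ĉevalo", "cigno"]))
```

On the sample data, the entries come out as cigno, ĉambro, ĉevalo, dato, whereas a plain code-point sort would place the ĉ words after dato. The naive word check flags both "vidis" and the accusative form "cignon", which shows why flagged words still need human review and why a real tool would need some awareness of Esperanto endings.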

Use in other projects

ESPDIC is licensed under a Creative Commons Attribution 3.0 Unported License. What this means is that anyone can use, transmit, or modify ESPDIC for any purpose, including commercial purposes, as long as the source is properly attributed. The format and licensing of ESPDIC were designed to facilitate its use in a wide variety of projects.

In addition to a simple search interface on the ESPDIC home page, there is a growing number of applications which make use of ESPDIC. And although my initial expectation was that ESPDIC would be used in projects that are electronic in nature, I’m very pleased to note that two printed versions of ESPDIC have already been published. It is my very sincere hope that the Esperanto community will be able to leverage ESPDIC in many more ways in the future.

Future direction / call for volunteers

Now that ESPDIC has reached over 60,000 entries (the size of a “large” printed dictionary), I plan to concentrate efforts in two main areas: usage examples and proofreading / editing of existing entries.

One shortcoming of many dictionaries (Esperanto and otherwise) is the lack of usage examples, i.e. how words are used in common phrases and contexts. In addition to “tempo = time” a dictionary should also have examples of common constructions and phrases using this word. To this end, ESPDIC contains entries such as “de tempo al tempo, dum restas tempo, ĝis la nuna tempo, je reala tempo, kuro kontraŭ la tempo”, etc. While reading or listening to Esperanto I usually jot down any new or “interesting” phrases that I hear, and I would highly encourage Esperantists to consider gathering and submitting their own “finds” to the ESPDIC project – this is an area in which experienced Esperantists can make a tremendous contribution.

Although I have manually read through the entire dictionary at least twice and regularly make changes and corrections to existing entries, there is also a real need for skilled Esperantists who are willing to check entries and suggest any necessary corrections or additions. No level of participation is too small – even a single entry or edit is welcome and all contributors will be acknowledged.

“Vortaro, tio estas la tuta universo laŭ alfabeta ordo” (“A dictionary is the whole universe in alphabetical order”)
— Anatole France