The Ultimate Offline Dictionary with dictd on Linux

Installing dictd on Arch Linux

  1. Install dictd. Follow the instructions here.

  2. Install dictionary databases from the AUR (search for dictd).

    A number of bilingual dictionaries from FreeDict are available from the AUR.

  3. Gather offline dictionary databases. There is an old tutorial on how to set this up, but the FTP server that hosted the dictionary databases is permanently down. After a long search, I found that you can retrieve them from file watch, and some of them are pre-formatted.

    Note 1: If make fails with an error in dictfmt.c, open that file (line 467, or whichever line the error reports) and add ; after skip:. This fix was inspired by the post here.

    Note 2: I could not find the law dictionary (dict-bouvier) anywhere; what I ended up doing was installing it on Ubuntu and copying the database files over to my Arch Linux machine.

  4. Update dictd.conf

  5. Restart dictd
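
As a rough illustration of step 4, the sketch below scans a database directory for .index files and prints one database stanza per dictionary in the layout dictd.conf uses. The directory /usr/share/dictd and the helper names find_databases and render_stanza are assumptions, not taken from the steps above; adjust them to wherever your databases actually live.

```python
from pathlib import Path


def find_databases(db_dir):
    """Return (name, data_file, index_file) for each dictionary in db_dir.

    A dictionary counts only if its .index file has a matching .dict.dz
    or .dict file next to it.
    """
    found = []
    for index in sorted(Path(db_dir).glob("*.index")):
        name = index.stem
        for suffix in (".dict.dz", ".dict"):
            data = index.with_name(name + suffix)
            if data.exists():
                found.append((name, data, index))
                break
    return found


def render_stanza(name, data, index):
    """Format one dictd.conf database entry."""
    return (
        f"database {name} {{\n"
        f'    data "{data}"\n'
        f'    index "{index}"\n'
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical location; check where your databases were installed.
    for entry in find_databases("/usr/share/dictd"):
        print(render_stanza(*entry))
```

The output can be reviewed and pasted into dictd.conf by hand; the script deliberately does not write the file itself.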

Dictionary Lookup with GoldenDict and a dictd server

Once the dictionaries have been installed and dictd has been configured, GoldenDict can be set up to access the dictd dictionary databases by following these steps:

  1. Press F3 to bring up the “Dictionaries” window.
  2. Click on the “DICT servers” tab.
  3. Click “Add…”
  4. Click to put a checkmark under “Enabled”
  5. Double click under “Address” and type: dict://localhost. Make sure to change localhost if dictd is running on another server.
  6. Finally, click “OK”.
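
Before pointing GoldenDict at the server, it can help to confirm that dictd actually answers. The sketch below speaks a small subset of the DICT protocol (RFC 2229) over a plain socket: it sends one DEFINE command and collects the definition text. Port 2628 is the protocol's standard port; the function names and the 10-second timeout are illustrative choices, not part of the setup above.

```python
import socket


def parse_definitions(lines):
    """Extract definition bodies from DICT response lines (CRLF stripped).

    Each definition starts with a 151 status line and ends with a line
    holding a single "."; a leading ".." in the body stands for ".".
    """
    definitions, body = [], None
    for line in lines:
        if body is None:
            if line.startswith("151 "):
                body = []
        elif line == ".":
            definitions.append("\n".join(body))
            body = None
        else:
            body.append(line[1:] if line.startswith("..") else line)
    return definitions


def define(word, host="localhost", port=2628, database="*"):
    """Ask a DICT server for `word`; "*" searches every database."""
    with socket.create_connection((host, port), timeout=10) as sock:
        with sock.makefile("rw", encoding="utf-8", newline="") as stream:
            stream.readline()  # skip the 220 greeting banner
            stream.write(f'DEFINE {database} "{word}"\r\nQUIT\r\n')
            stream.flush()
            # The server closes the connection after QUIT, ending the loop.
            lines = [raw.rstrip("\r\n") for raw in stream]
    return parse_definitions(lines)
```

Calling define("dictionary") returns a list of definition strings, or an empty list when the server reports no match. A connection error at this point usually means dictd is not running or is listening on a different address.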

Access through web browser

See the instructions here and the sample site. A Perl CGI version is available here. See also WWW Frontends for DICT Servers.

Note: Please configure dict.conf (NOT dictd.conf) accordingly.

Access from Emacs with built-in dictionary.el

dictionary.el ships with the XEmacs distribution. Start it with M-x dictionary RET.

The Resources

The original author's site: here. Official Emacs dict-mode: here. dictionary.el on GitHub: here, here. Chinese dictionary: here.

Emacs init

A few notes on the conversion of the CIA world factbook 2002

The following is paraphrased from the world2 package which I retrieved from file watch.

The CIA world factbook 2002 is available in HTML format in different versions at the CIA website http://www.cia.gov/cia/publications/factbook/ . Unfortunately there are no plaintext / rtf / doc versions available from the CIA server.

The HTML versions differ in their maps and other graphical elements, which are provided in the full-blown version but missing from the “low bandwidth version” at http://www.cia.gov/cia/publications/factbook/countrylisting.html . The low-bandwidth version was used to create the dict version. In addition to the country listings, the 6 appendices found in the full version are included in the dictionary.

downloading of files

The country listings were downloaded using wget:

The appendices and the copyright info were downloaded manually and saved to files.

conversion to txt

For HTML-to-text conversion, html2text was used, as lynx -dump does not retain the table structure of the text:

html2text does a good job with the text layout, but after conversion the original HTML format yields two-column output in which both columns have approximately the same width. This is not ideal for this application, but it works reasonably well.

pre-formatting

In order not to reinvent the wheel, the final formatting to dict format was done by dictfmt. Prior to the dictfmt run, the country listings, the 6 appendices, and the copyright info needed to be cleaned up. This was done manually for the appendices and the copyright info, as I considered it not worth the trouble to write a program for this task. The country listings were processed by a small custom preprocessor written in Python (see convert.py).

convert.py makes a few assumptions about where to find its input. If you intend to use it for your own purposes, please adapt the configuration section to your needs. Make sure that the appendices and country-specific files are located in different directories and that there are no other .txt files in these directories.

convert.py reads the text files in the countries/ and appendices/ directories as well as the copyright file and writes the preformatted result to stdout:

Before exiting, the script creates a table of contents.
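
The original convert.py is not reproduced here, so the following is only a sketch of what such a preprocessor might look like, not the world2 package's script. It assumes each .txt file becomes one entry named after the file, and that the output targets the layout read by dictfmt's -c5 option, assumed here to be a line of five underscores, a blank line, then the headword. The directory names follow the description above; the function names and the headword derivation are hypothetical.

```python
import sys
from pathlib import Path

SEPARATOR = "_____"  # assumed headword marker (dictfmt -c5 style)


def format_entry(headword, text):
    """Return one entry: separator, blank line, headword, then the body."""
    body = "\n".join(line.rstrip() for line in text.strip().splitlines())
    return f"{SEPARATOR}\n\n{headword}\n{body}\n"


def convert(directories, out=sys.stdout):
    """Write every .txt file under the given directories as one entry.

    Returns the headwords in output order so the caller can build a
    table of contents from them.
    """
    headwords = []
    for directory in directories:
        for path in sorted(Path(directory).glob("*.txt")):
            # Hypothetical rule: derive the headword from the file name.
            headword = path.stem.replace("_", " ").title()
            out.write(format_entry(headword, path.read_text(encoding="utf-8")))
            headwords.append(headword)
    return headwords


if __name__ == "__main__":
    # Directory names taken from the description; adjust as needed.
    toc = convert(["countries", "appendices"])
    sys.stdout.write(format_entry("Table of Contents", "\n".join(toc)))
```

Like the script described above, this version writes to stdout and emits the table of contents last, so the result can be redirected to a file for the dictfmt run.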

creating the dict database

you guessed it:

finally:

Webster and Emacs with SDCV

If you just want Webster’s dictionary, sdcv is another option. The following steps are adapted from here and from the original post. Here’s how you do it (at least on a GNU/Linux system).

  1. Download the Webster’s dictionary in StarDict format. (Apparently it’s not “some strange format”, but a standard format for a digital dictionary.)
  2. Unzip the files and put them in ~/.stardict/dic, or /usr/share/stardict/dic
  3. Install sdcv, a command-line utility for accessing StarDict dictionaries. (On Arch GNU/Linux, follow the instructions here.)

  4. Don’t go to MELPA for the sdcv package; it’s usable, but slightly broken. Get sdcv-mode from here instead and load it in Emacs.
  5. Now, with point on a word you want to look up, say M-x sdcv-search and confirm the selection with RET (or just say M-x sdcv-search anywhere and type the word you want to check).
  6. You can press RET on any word in the definition to look that one up. Sorry for destroying at least a few hours of your life with that tip.
  7. Relish thy copy of Webster’s in Emacs.

Google Translate as a command-line tool

Quoted directly from its official website:

Translate Shell (formerly Google Translate CLI) is a command-line translator powered by Google Translate (default), Bing Translator, Yandex.Translate and Apertium. It gives you easy access to one of these translation engines in your terminal:

Give it a spin:

Translate Shell can also be used like an interactive shell; input the text to be translated line by line:

Some useful commands:
