20160527

dictionary.com as a StarDict dictionary

Here we go:
Word entries: 439,676
Definitions: 149,135

This is a 2.4.2-compliant StarDict dictionary, meaning that it should work pretty much anywhere.

Version 2.4.2 of the StarDict format doesn't support the synonyms file, nor does it support multiple definitions for the same word. To get around this restriction, I put all the definitions for the same word one after another in a single entry.

For example, instead of having 3 different entries for 'fan' like so:
  1. any device for producing a current of air
  2. an enthusiastic devotee
  3. one of the long, sharp, hollow or grooved teeth
there will be only one entry 'fan', with 3 sub-definitions like this:
  1. fan
  any device for producing a current of air...
  --
  2. fan
  an enthusiastic devotee...
  --
  3. fan
  see Fang...
Doesn't change much, right?
All of the synonyms, different spellings and stuff for 'fan' will point to this single big ass definition.
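Here's a rough sketch of that merging trick in Python. It's not the actual converter code, just an illustration: the function names, the `--` separator constant and the input dictionaries are all made up for the example.

```python
SEPARATOR = "\n--\n"  # assumed separator, mirroring the example above


def merge_definitions(headword, definitions):
    """Join several definitions of one headword into a single numbered block."""
    parts = [
        f"{number}. {headword}\n{text}"
        for number, text in enumerate(definitions, start=1)
    ]
    return SEPARATOR.join(parts)


def build_entries(definitions_by_word, synonyms_by_word):
    """Return {lookup_word: merged_text}.

    Every synonym or alternate spelling maps to the same merged text
    as its headword, so no synonyms file is needed.
    """
    entries = {}
    for headword, definitions in definitions_by_word.items():
        merged = merge_definitions(headword, definitions)
        entries[headword] = merged
        for alias in synonyms_by_word.get(headword, []):
            # Don't clobber an alias that is already present as an entry.
            entries.setdefault(alias, merged)
    return entries


if __name__ == "__main__":
    definitions = {
        "fan": [
            "any device for producing a current of air...",
            "an enthusiastic devotee...",
            "see Fang...",
        ]
    }
    synonyms = {"fan": ["fans", "fanned"]}
    print(build_entries(definitions, synonyms)["fans"])
```

Looking up 'fans' or 'fanned' gives back the same merged text as 'fan'.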

You can use this dictionary on your desktop computer using GoldenDict:

GoldenDict

And of course this will work on many eBook readers that support StarDict dictionaries, like the Onyx Boox readers, where it's natively supported. It also works on PocketBooks, although only through KOReader, since the native dictionary app is for the shit Lingvo format. Which is the reason I did all of this.


Wanna know how I did all of this?

  • First I tried to do this. But this failed miserably.
    The XDXF -> ABBYY Lingvo converter was the last straw, and I gave up.
  • After many months, I got pissed again, just like last time.
    Except that since I was using KOReader, I could try something new: converting dictionary.com to a StarDict dictionary.
  • Which I did. With success this time.
    Head over to GitHub for the source code.


The converter in action

Enjoy before this whole shit gets taken down!
╯°□°)╯︵ (~ .o.)~



20160525

Change of plans: forget XDXF and Lingvo dictionaries -> use StarDict

Yep.

I just saw that KOReader supports StarDict dictionaries. That's good news, since the previous converter had limitations and I just couldn't convert my 700 MB XDXF to the Lingvo format without losing too much stuff.

So, change of plans. Let's convert dictionary.com to the StarDict format, which can then be read by KOReader on my PocketBook.

And... I'm almost there. StarDict files are all created and it works \o/
I just need to add the synonyms, either by using a .syn file or by adding the synonyms as regular word entries. Not sure yet.
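To give an idea of the "synonyms as regular word entries" option, here is a minimal Python sketch. It is not the code from the GitHub project, and the function names and inputs are invented for the example. It relies on the StarDict 2.4.2 .idx layout (the word as a NUL-terminated UTF-8 string, followed by a 32-bit big-endian offset and a 32-bit big-endian size into the .dict file) and assumes the .ifo declares `sametypesequence=m`, so each .dict record is just plain text.

```python
import struct


def stardict_sort_key(word):
    """Approximate StarDict's ordering: ASCII case-insensitive first,
    then a plain byte comparison to break ties."""
    raw = word.encode("utf-8")
    return (raw.lower(), raw)  # bytes.lower() only touches ASCII letters


def build_dict_and_idx(definitions, aliases):
    """definitions: {headword: text}; aliases: {alias: headword}.

    Returns the raw contents of the .dict and .idx files.
    """
    dict_data = bytearray()
    location = {}  # word -> (offset, size) inside dict_data

    for headword, text in definitions.items():
        payload = text.encode("utf-8")
        location[headword] = (len(dict_data), len(payload))
        dict_data += payload

    # A synonym is just one more index entry pointing at the same bytes.
    for alias, headword in aliases.items():
        location.setdefault(alias, location[headword])

    idx_data = bytearray()
    for word in sorted(location, key=stardict_sort_key):
        offset, size = location[word]
        idx_data += word.encode("utf-8") + b"\x00"
        idx_data += struct.pack(">II", offset, size)

    return bytes(dict_data), bytes(idx_data)


if __name__ == "__main__":
    dict_bytes, idx_bytes = build_dict_and_idx(
        {"fan": "any device for producing a current of air..."},
        {"fans": "fan"},
    )
    print(len(dict_bytes), len(idx_bytes))
```

The nice part is that the definition text is stored only once; each extra synonym costs just a few bytes in the index.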

Anyway I deleted my old project DictionaryDotComToXdxf from GitHub.
Instead, let me introduce offline_dictionary.com.

Early version
And it works using StarDict 3.0.4 rev 10.

When I'm satisfied with my StarDict version of dictionary.com I will post it here, and spread it around the world I guess.

20160328

dictionary.com as an XDXF

What's up world.

So, lately (a couple years ago actually) I have been extremely pissed with the lack of good English dictionaries on the net, and especially on eBook Readers.

See, I used to own an Onyx Boox a long time ago when eReaders were still a novelty. On my defunct Onyx, we could copy/paste dictionaries in StarDict format somewhere in there, and then we could use them while reading a book. Pretty awesome. StarDict dictionaries are lying around on the interwebz and it was easy to grab a few of them 'free of charge.' If you know what I mean. Then we could also convert some old Babylon dictionaries to StarDict (that is, before Babylon switched to a more locked-down format).

Life was almost good. But then lately my Onyx broke the fuck down, and its battery became very erratic. So I bought a PocketBook Lux 3 (626) last week.

This new eReader is pretty awesome. No Android OS there, sure, but seriously, this homemade Linux OS is very well made. Better than the one from Onyx actually, IMHO.

Well anyway, there is only ONE thing that was grinding my gears (and balls):
Even though we can put our own dictionaries in there, it only supports ABBYY dictionaries. And these are much less common. So for once in my lifetime, I was ready to BUY a dictionary on their shit Polish platform or whatnot. But the issue is that the available dictionaries are simply SHIT. I mean, they don't have my lovely American Heritage 4th Edition. Therefore, yeah, they are SHIT.
Doesn't matter, some Russian dudes made an XDXF to ABBYY converter. It's right there, hosted by me because... just because.

So... since I was really pissed lately (for a couple years) because of the lack of availability of various dictionaries -- whether for eReaders or even on the net in general -- I decided to hack something myself.

The idea is the following:
Create the American Heritage in XDXF format, which can then be converted into any other format by the smart people on the interwebz.

Except that, instead of the American Heritage 4th edition, I simply hacked dictionary.com and transformed it into an offline XDXF version.

Because, as I recall, dictionary.com was using the American Heritage dictionary a long time ago. Then something changed and they moved to the Random House publishing company. But they were still providing the same dictionaries. And then dictionary.com became its own shit, and its own dictionary. So that's the one you will be downloading below.

So I managed to convert dictionary.com to an XDXF. But I couldn't convert it to the ABBYY format using the Russian dudes' tool. I may work on that later.



Here is the XDXF file (547,620 KB unpacked):



For coders: go there and learn how to do what I did by yourself and get the source code of everything.
For free. So go and do it.



As for the legality of this whole thing, well, it's not. And I don't give a fuck.

So I advise you to download the stuff on this page even faster, and then to hide it somewhere very deep.

Now I need to convert that huge XDXF into the ABBYY format... gotta think about that.

Peace.