I guess this thread is more about learning languages than language translation. Perhaps I am in the wrong place or should start a new thread of my own. I enjoy telling others what I know about the difficulties of translation, especially using computers.
I worked for two years in Germany and was exposed to people from at least 10 countries all working together on the Spacelab project. That was a fascinating experience. I remember hearing a lot of different kinds of English from people who spoke other languages natively. Even understanding the Brits was a little difficult and the Scots were fun to listen to. I can understand some French and I met a few people who spoke French, including one from Corsica who I imagined talked a bit like Napoleon. Another one was from the Congo and grew up in Belgium. He was about 7 feet tall, I think, and I am already quite tall at 6 ft 4.5 in. He made me seem short. He now lives in the USA.
I met this American guy who had studied in Sweden and, for a senior project in computer science, wrote a simple language translator in the computer language called SNOBOL. That inspired me to want to build my own translator program, which I named iTrans. That was in 1980 and I am still working on it. One of my ideas was to build a handheld device that could do translation, like a pocket communicator or universal translator device. At the time I was thinking of using a cable between two such devices. Later on wireless became possible and smartphones were invented. Before that there were the PDAs (Personal Digital Assistants), Palm Pilots, Windows handheld devices, etc.
So my idea for a handheld translator stayed on the back burner for a long time until the past couple of years when I got my Motorola Droid X phone which runs the Android operating system. I managed to build a simple app that does very simple translations. I started out with the set (English, Esperanto, French, German, Italian, Spanish, and Russian). I have made the most progress with French and German, and I know some Spanish and a little Russian, and some Italian. I could add Portuguese to the list quite easily.
The basic idea is to translate any of the Source languages to Esperanto, then translate the Esperanto text to any Target language. That way I don't need so many dictionaries, since I only need dictionaries that go to and from Esperanto. For n languages I'd need about 2n dictionaries, whereas if I translated directly from each Source language to each Target language I'd need about n squared dictionaries. For n = 7, I'd need 14 dictionaries using the intermediate language approach, and 49 dictionaries using the direct approach. As n goes up, 2n goes up linearly and n squared goes up much faster, so if I support 50 languages I'd need 100 dictionaries with the intermediate language approach and 2500 dictionaries with the direct approach.
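The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up for this example, and the counts follow the round figures in the text (2n and n squared). If a language paired with itself is left out, the direct approach needs n(n-1) dictionaries instead, which is shown as a third function.

```python
def pivot_dictionaries(n: int) -> int:
    """One dictionary into the pivot language and one out of it, per language."""
    return 2 * n


def direct_dictionaries(n: int) -> int:
    """One dictionary for every (source, target) pair, as counted in the text."""
    return n * n


def direct_dictionaries_distinct(n: int) -> int:
    """Same as above, but skipping pairs where source and target are the same."""
    return n * (n - 1)


for n in (7, 50):
    print(n, pivot_dictionaries(n), direct_dictionaries(n),
          direct_dictionaries_distinct(n))
```

Either way of counting gives the same picture: the pivot approach grows linearly while the direct approach grows quadratically.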
I have dictionaries for about 80 languages, some more complete than others, mostly stuff I have found in books, typed in manually, or downloaded from online sources. It's been a labor of love since the 1980s. I used to love building dictionaries back when I was using the direct approach, so I have lots of dictionaries that go from other languages to English, a few that go from English to other languages, and very few that translate anything else. I also have a start toward the Esperanto dictionaries I need for the new approach (which is not my idea, just something that has been tried before, though not very successfully, on a project called Ergane).
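To make the Esperanto-as-intermediate idea concrete, here is a minimal sketch of a word-for-word lookup that goes through Esperanto. It is not the actual iTrans code: the tiny dictionaries, the language codes, and the function names are all hypothetical, and it ignores grammar entirely (for instance, it never adds the Esperanto accusative ending).

```python
# Hypothetical toy dictionaries; real ones would hold many thousands of entries.
TO_ESPERANTO = {
    "en": {"the": "la", "dog": "hundo", "sees": "vidas", "cat": "kato"},
    "fr": {"le": "la", "chien": "hundo", "voit": "vidas", "chat": "kato"},
}
FROM_ESPERANTO = {
    "en": {"la": "the", "hundo": "dog", "vidas": "sees", "kato": "cat"},
    "fr": {"la": "le", "hundo": "chien", "vidas": "voit", "kato": "chat"},
}


def lookup_words(text: str, dictionary: dict) -> str:
    """Replace each word by its dictionary entry; unknown words pass through."""
    return " ".join(dictionary.get(word, word) for word in text.lower().split())


def translate(text: str, source: str, target: str) -> str:
    """Translate source -> Esperanto -> target, one word at a time."""
    esperanto = lookup_words(text, TO_ESPERANTO[source])
    return lookup_words(esperanto, FROM_ESPERANTO[target])


print(translate("the dog sees the cat", "en", "fr"))  # le chien voit le chat
```

Adding a new language in this scheme means adding one entry to each of the two tables, rather than one dictionary for every language already supported.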
I have also built an iPhone version of my new app. I'm struggling now with learning
how to do the User Interface (UI) for the iPhone. I've got a very simple UI scheme
that works on the Droid phones and tablets. I also plan for my app to run on iPads
and also on Windows and the Mac. I hope to make the Windows and Mac versions
sort of a home base for the mobile apps, like an alien mothership with little shuttles
for buzzing around the planets they invade.
I already have a Windows version and I can easily create a Mac version from the
code I've already written for the iPhone, since iOS and OS X apps are both normally
written in Objective-C. Android uses the Java programming language, which
can run on most any machine that has a Java runtime, but Apple discourages the
use of Java and does not support it on iOS.
I am also interested in Microsoft mobile devices running Windows Phone 7
and the new Windows 8 (Metro). I don't want to spread myself too thin, but I also
don't want to put all my eggs in one basket. Android is doing extremely well now
against the iPhone, but Apple is determined to squash Android and is suing various
hardware manufacturers like HTC and Samsung for patent infringement. I should have
patented my ideas back in the 1980s and then I could sue Apple and Android for
violating my patents. Actually I did have some good ideas, but I didn't have the
means to implement them in 1980. One of my friends back then invented an
electronic golf gizmo that let people swing a club and the computer would measure
how hard the ball was struck and estimate how far it would have gone, etc. It was
almost like an early version of the Wii. Of course he drew up the plans and tried to
patent the idea, but didn't get too far with that. It was a very creative idea for the
time. Perhaps he got the idea from someone else or maybe it wasn't really a new
idea, just an application of an old idea. Back then people were still playing Pong and
Atari games and had no clue what was in store for them in the future, with PCs and
Macs.
I can also run my programs on Linux machines, and indeed Android is based on a
Linux kernel, and OS X is based on Berkeley Unix, modified and renamed Darwin by
Apple.
I would eventually like to be able to translate at least 100 languages, but right now
I'd settle for 7, or a dozen, or a few dozen. I have my Windows program which
does about 80 languages, none of them perfectly, but some of them reasonably
well. Last night and this past week I've been working on translating Snow White
and the Seven Dwarfs (Dwarves is now obsolete, I think) from various languages
to English. I've done German, French, Dutch, Croatian, Czech, Slovak, and a few
more. I've also been playing with Icelandic lately. I translated the book of Genesis
to English from Icelandic, last night. Genesis contains 50 chapters. I can understand a lot of the output, but it is certainly not ready for prime time.
I've used Grimms' Fairy Tales and Hans Christian Andersen's Fairy Tales as test cases for my translator for many years. I started out trying to translate Dutch in 1980, later German and French. I've added other languages over the years, including Low German, Old English (Anglo-Saxon), Middle English, all the Scandinavian languages, Finnish, the Baltic languages, Slavic languages, etc. I have about 80 languages in my Source and Target menus, but I can't translate all of those languages equally well, nor do I expect to be able to do that in this lifetime. The best I can do is make sure the code works right and let other people populate the dictionaries, if my program works well enough to warrant anyone else getting involved. Sometimes I think of recruiting volunteers who can translate their native language to/from Esperanto, so that I could build up my dictionaries over time. Right now I mostly translate fairy tales which have been translated by humans, as well as the various foreign language versions of the Bible, of which there are hundreds, perhaps thousands. I have the text of at least 50 versions of the Bible that I can use for testing my translator.
I was looking for a group where people were interested in translation, but most of the groups are not interested in what I am doing, which is not what one would call "professional translation" and I don't know that my programs will ever be able to provide that level of translation. I can currently translate some other languages to English and the results are not fluent English, but they are often understandable enough for people to get the gist of what the original text was saying.
Writing a translator that can produce output as good as a human translator may
well be impossible. Even the best computerized language translation systems
leave a lot to be desired. I can produce simple literal or rote translations and even
recognize a lot of idioms. However, to really translate accurately, one has to
understand the text completely and render it into the target language as well as a
native speaker of that language would.
That just isn't possible, except in very limited situations, such as translating
weather reports, which tend to use the same format and the same terminology
over and over again. In such a situation, one can translate almost perfectly,
but it comes from mimicking what human translators would do, only mechanically.
It's pretty easy for me to simply look up words and output their equivalents, but
that isn't very useful, since many words are inflected or conjugated and one has
to recognize all the affixes (prefixes, infixes, and suffixes) which modify the meaning
of a core word from the dictionary. One has to recognize case, gender, person,
number, form, tense, mood, aspect, transitivity, active/passive, and all sorts of
other grammatical things in order to be able to even do what one could call rote
translation or literal translation. Even then, literal word-by-word translations
sometimes make no sense at all, though often they do.
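As a rough illustration of recognizing affixes before looking up a core word, here is a small sketch that analyzes Esperanto words, chosen because its endings are regular (-o noun, -a adjective, -e adverb, -i infinitive, -as/-is/-os present/past/future, -j plural, -n accusative). It is not how iTrans works internally; the stem list and the function name are made up for this example, and real languages need far more rules and exceptions than this.

```python
# Hypothetical stem dictionary; a real one would be much larger.
STEMS = {"hund": "dog", "kat": "cat", "vid": "see"}

# Grammatical endings, longest first so "as" is tried before "a".
ENDINGS = [
    ("as", "present"), ("is", "past"), ("os", "future"),
    ("o", "noun"), ("a", "adjective"), ("e", "adverb"), ("i", "infinitive"),
]


def analyze(word: str):
    """Strip -n, then -j, then the grammatical ending, and look up the stem."""
    rest = word.lower()
    accusative = rest.endswith("n")
    if accusative:
        rest = rest[:-1]
    plural = rest.endswith("j")
    if plural:
        rest = rest[:-1]
    for ending, category in ENDINGS:
        if rest.endswith(ending) and rest[: -len(ending)] in STEMS:
            stem = rest[: -len(ending)]
            features = [category]
            if plural:
                features.append("plural")
            if accusative:
                features.append("accusative")
            return {"stem": stem, "gloss": STEMS[stem], "features": features}
    return None  # no known stem plus ending matched


print(analyze("hundojn"))
print(analyze("vidis"))
```

A plain dictionary lookup of "hundojn" would fail, while the analysis finds the stem "hund" and reports that the word is a plural noun in the accusative, which is the kind of information a translator needs to pick the right form in the target language.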
So I could probably bore you to tears by telling you how my program works in
exhaustive detail, or show you lots of examples of the kinds of output it produces,
or of the kind of errors it tends to make (areas I need to improve), but if no one
is interested, I'll just go somewhere else. I don't mean that unkindly; I just don't
want to waste my own time or yours with something you are not nuts about. I'm
kind of nuts about languages and translation, especially writing programs that
translate languages. I'd love to meet some other people who do the same thing, but I have
only met a few, and some of them have already given up on it, and some have
worked years without making any significant progress, nothing to write home about.
I could write a book or two about all the stuff I have done, and it would be dismissed
by academics as unprofessional or unscientific or not of any real value.
I have studied languages all my life, since I was about 6 years old, but I really began
in earnest when I took Spanish and French in high school, and I have never stopped
since. I have mostly concentrated on learning
grammar and vocabulary without actually learning to speak all the languages I have
studied. I cannot speak any language fluently except for English, but I can make
myself understood in German, and I know enough French and Spanish to carry on
a simple conversation and order a beer most anywhere. Of course I can order a
beer in English in most every bar in the world. I don't go to bars. Maybe I should.
At any rate, I have dabbled at translating many languages, and I know something
about most all of the major languages. Unicode has made my job easier, as far
as programming goes, so I can load and display all sorts of text that I could not have
handled back in the days of MS-DOS and the early days of Windows.