Copier Auto-Translates Japanese to English
StCredZero writes "Wild. Fuji has created a photocopier that automatically translates documents from Japanese to English. That's pretty nuts. Apparently, the copier can figure out what sections are text, OCR the text, send it to a translation engine, and put the English back into place."
Turn into actual pictures of people, too! Amazing!
I suppose the next version of this printer will be able to convert tentacle porn to... Oh God! /puke
How long has OCR software been around?
...
All you do is have a scanner - it scans. OCR puts the text together. You run it through babelfish, throw the text back on and you have it.
Well... We all know how those translators work.
OCR - OCR babelfish
...post!
They've been using this for years to translate instruction manuals.
Imagine if you upload manga scans to Flickr, and it automatically translates them to English.
Imagine if you upload anime to YouTube, and it automatically includes an English subtitle.
Virtual Betting on Facebook for non-geeks.
StCredZero "writes wildness. Fuji drew up the photocopying machine which automatically translates the document from English from Japanese. That is the clean nut. With respect to appearance, as for the copier the text, as for OCR what kind of section text, to send that to the translation engine, and in the place English". You reset, or can grasp.
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}
It translates to Engrish?
I reserve the right to think for myself. Others' opinions are optional. Puppy on lap = typos...not illiteracy.
It doesn't seem to be mentioned in TFA, but I have to wonder: Exactly how fast does it copy if it has to translate? I'm sure it's not the near-instantaneous work we've come to expect of our Xeroxes. If the translations aren't just gibberish Engrish, its usefulness will be immense, so the time won't be so much of a concern; but I do still wonder.
98% of America's teens drink alcohol, smoke, and have sex. Put this in your sig if you like bagels.
"Why, so can I, or so can any man, But will they come when you do call for them?"
--Shakespeare, Henry IV, pt. One, act III
Do you even lift?
These aren't the 'roids you're looking for.
The version I want is where I scan in a Tenner, the machine sends it to a "Translation Engine" & the output is a Score :-)
If I had an Ass, I'd call it Fanny Bottom, then I could slap my Ass; Fanny Bottom, on the Arse.
is about to belong to this photocopier if it translates things in a very literal fashion....
You have no chance to survive make your time. :)
US businesses that currently accept chip and PIN/signature
Between the inaccuracy of unproofed OCR and the poor quality of machine translation, I can't imagine that the results are very good.
LOAD "SIG",8,1
I think of that this is rather sweet. As for converting Japanese rather than easy, converting other manner with a certain manner, with other manner hard. As for existence of Chinese character, for example, thing is made easier. But (with easily from Chinese character. Chinese which becomes complete)
-:sigma.SB
WARN
THERE IS ANOTHER SYSTEM
All your base are belong to us
Looks like any machine that can produce a full page from one language to Japanese, English, Korean and both Chinese character sets (Mainland vs. Taiwan classical) within an hour would be a winner. Unfortunately, if there is no proofreader, the whole process will result in failure, embarrassment and red faces...
Hey, I just got one of these and all it'll print is "All your base are belong to us."....
Don't tell me to get a life. I'm a gamer; I have LOTS of lives!
[Reading from his book, "Jimmy James: Macho Business Donkey Wrestler," translated to Japanese and back again]
Jimmy: I had a small house of brokerage on Wall Street. Many days no business comes to my hut. Jimmy has fear? A thousand times no. I never doubted myself for a minute, for I knew that my monkey-strong bowels were girded with strength like the loins of a dragon ribboned with fat and the opulence of buffalo...
[pauses while turning page]
Jimmy: dung.
Finally! Now I can pour my collection of Hentai into one and enjoy the interesting story lines and character development...
But I wonder does the english language contain enough exclamations though: Uh!, Ah! ?
www.tribalnetworks.org - helping tribal people around the world to own their own means of high-tech communications
I can't wait to see the translator mangle translations.
So how good is the translation engine? Currently Babel Fish and Google both suck at translating Japanese (to English). If it's any good, I would like to see a web based version.
...for a machine, under the current paradigms (that is, no true artificial intelligence) to properly translate something. Translation is not an exact science, and you can't expect to get a decent translation by just having a word-per-word approach. Heck, not even a sentence-per-sentence or paragraph-per-paragraph approach would ever be enough. Translation requires deep social knowledge--you need to know what you are translating, from whom you are translating, for whom you are translating... that is, you need to enclose your translation in a sociological context. No machine can ever wish to do that without artificial intelligence. It's hard enough as it is to get a human being to understand that word-per-word translation is stupid--imagine telling that to your CPU core.
;)
Disclaimer: I'm a translation student myself
You fail it!
General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
Don't ask about the jointly developed update to the Hitachi Magic Wand. Just don't ask.
"Wildness. Fuji drew up the photocopying machine which automatically translates the document from English from Japanese. That is the clean nut. With respect to appearance, as for the copier the text, as for OCR what kind of section text, to send that to the translation engine, and in the place English". You reset, or can grasp
If you sit on the glass and photocopy your ass, it just switches to "Enlarge by 50%" mode.
In Democratic Japan, translation photocopies you!
What is so wild about it? So they took a scanner, OCR software, a translation engine and a printer and put them all in one box. Well actually not even that since it has to be hooked up to a computer for the translation part. How about replacing the translation engine with a spell checker? Hooray, call the patent office, I just invented a photocopier that fixes the spelling on the documents!
Negative moral value of force outweighs the positive value of good intentions.
Another friend twists? Nose presses slashdong. The stray digs an initiate tough. Why does slashdong steam without the four curtain?
Wild. Fuji made the photocopier that automatically translated the document from Japanese to English. It is a beautiful nut. Obviously, the copier can return the place english by calculating the text what kind of section is a text, and OCR, and sending it to the translation engine.
Google Translate sucks bad. Excite usually gives way better translations.
Did you translate this post on the copier?
"There is no time, sir, at which ties do not matter," Jeeves, (Jeeves and the Impending Doom)
-FL
Too bad it can't translate the article fast enough.
Google cache: http://64.233.169.104/search?q=cache:iMyv1y2mOAkJ:www.digitalworldtokyo.com/index.php/digital_tokyo/articles/photocopier_translates_japanese_to_english_at_touch_of_button+http://www.digitalworldtokyo.com/index.php/digital_tokyo/articles/photocopier_translates_japanese_to_english_at_touch_of_button/&hl=en&client=firefox-a&gl=us&strip=1
This may help people from making some big mistakes with their tats...
Captain: What happen ? ....
Mechanic: Somebody set up us the bomb.
Operator: We get signal.
Captain: What !
Operator: Main screen turn on.
Captain: It's you !!
CATS: How are you gentlemen !!
CATS: All your base are belong to us.
CATS: You are on the way to destruction.
Captain: What you say !!
CATS: You have no chance to survive make your time.
CATS: Ha Ha Ha Ha
Operator: Captain !! *
Captain: Take off every 'ZIG' !!
Captain: You know what you doing.
Captain: Move 'ZIG'.
Captain: For great justice.
Looks like it works well...
Instead of the next guy complaining he's getting duplex and stapling, you'll have him pulling the box apart to figure out why it's in a foreign language. Helpdesk fun for all.
Darnit, I was hoping to be the first one to make an engrish joke.
...before we see photos of the scanned documents in http://www.engrish.com/ ... What were they thinking of??
The Wknd Sessions - Malaysian and South East Asia independent music
all your documents are belong to us
Our those which achieve the main thing of the ether of title in order to know never clearly that remnant because of the remainder which it should find!
Since I could never have created the above err, prose, myself, I typed the following answer into babelfish and translated it into Japanese, and for good measure, back into English.
Clearly that remains for those of us who have achieved the title Ether Lord to know, and for the rest never to find out!
And as you tread the halls of sanity, You feel so glad to be, Unable to go beyond. I have a message, From another time..
The translator of the specialist may face when being the technology which in the job placement place which crosses the world Fuji Xerox developed because of the most recent prototype photocopy machine it makes well, directly.
The device just of the Japanese show [subscription link] presently, while maintaining the layout of the origin, scans the seat to which the Japanese from the newspaper or the magazine text is printed, Chinese, can stir English or Korean that translation. Repel the switch of opposite direction and the work of language analysis.
As for the secret of Fuji Xerox the text for designing the maintenance, in algorithm and the enthusiastic translation server and connects this being able to distinguish during sketch and the line there is a nameless copier in the networking.
Concept 1 while touching, as for the translation machine because of the thing is splendid thought with someone who works systematically with many language, (...In order a crime is), for Babelfish and Google to translate, when it is the sentence which becomes ruinous the place was seen, transferring/changing of technology from present formation of the machine-translation (MT) software which is probably will be desired.
All your copies are belong to us!
Google uses statistical means to do their translations. Look it up.
Either you are old, or a bit naive. I think in the next 10 years we will see significant improvement.
Yeah, 'cause researchers have long promised us that AI will reach us in 10 years. <sarcasm>
Seriously, I think you underestimate the difficulty of translating. Have you done any major foreign-language translation -- especially of conversational speech? My experience has primarily been with Japanese and English, and I'll tell you right now that it can be nightmarish.
Sentence fragments are the worst part. Japanese has a completely different word order from English. All modifiers (including phrases and clauses) come before the word they modify, and the language has a Subject-Object-Verb order. "I just saw the man who stole my friend's watch last Tuesday" becomes "Just I Last Tuesday friend's watch stole man saw." Now try translating that from Japanese to English when the sentence is cut in half.
Worse, the language has very different levels of allowed vagueness. "Complete" sentences in Japanese can contain just a descriptor or an action without any specification of who did/was what. Conversely, translating "3 of them" in English to Japanese is hard because you have to know "3 of what?" to know what counting suffix to use.
Another problem is that many very different words sound exactly the same when conjugated to the gerund or perfective forms. English has a number of homonyms, but there are MANY more opportunities for mix-ups if you don't have access to kanji to tell the semantic meaning apart because Japanese has a much more limited range of phonemes. For example, take "katte" which is the gerund form of the verbs "kau" (buy), "kau" (keep/raise), "karu" (cut), "karu" (spur on), and "katsu" (win). That's 5 completely different verbs that conjugate to the same sound. If they're written phonetically or you're going from speech, then you have to be able to understand the meaning behind the words to translate. (Did I mention earlier that you may not have an explicit subject and object to go off of?)
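To make the "katte" point concrete, here is a minimal Python sketch. It uses only the five verbs listed in the comment above; the table layout and the function names (`build_reverse_index`, `candidates`) are made up for illustration and are not taken from any real translation engine.

```python
from collections import defaultdict

# (dictionary form, gloss, conjugated form), copied from the comment above.
VERBS = [
    ("kau", "buy", "katte"),
    ("kau", "keep/raise", "katte"),
    ("karu", "cut", "katte"),
    ("karu", "spur on", "katte"),
    ("katsu", "win", "katte"),
]


def build_reverse_index(verbs):
    """Map each conjugated sound back to every verb that could have produced it."""
    index = defaultdict(list)
    for dictionary_form, gloss, conjugated in verbs:
        index[conjugated].append((dictionary_form, gloss))
    return index


REVERSE_INDEX = build_reverse_index(VERBS)


def candidates(spoken_form):
    """Return all (verb, gloss) pairs that match a phonetic form; empty if unknown."""
    return REVERSE_INDEX.get(spoken_form, [])


for verb, gloss in candidates("katte"):
    print(f"katte <- {verb} ({gloss})")
```

A lookup like this can only list the five candidates. Picking one still needs the context the comment is talking about.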
Then you get into issues of translating things like politeness levels, different ways of addressing people, and other concepts that don't translate well into English or concepts like singular vs. plural that are dropped in going to Japanese. Let's not even consider puns and poetry!
These are not trivial issues. An automatic translator would need to somehow be able to conceptualize what a person is trying to speak about, which would require understanding the story being told and an ability to predict where they are going with it. This will require strong AI.
Accurate and intelligible translation is an art -- not a science -- because it requires an intuitive and empathetic ability to understand the mind of the speaker well enough to map their thoughts into a different method of expression.
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
is that you?
Pretty soon, you'll be hearing this at the office: "Damn copier! First, I got a paper jam. I finally got that fixed - but my hands are all covered in toner. Now, all my documents are coming out in a different language! What's going on? I can't even tell what language this is! Grrrrrr"
Is it standalone, or does it phone home? If it sends the content out for translation, it's a huge security hole for an organization.
They show a picture of the machine, I see. But what they don't show a picture of, is the before / after pictures in various languages. I'm not impressed.
This is vely intelesting! Vely!
Doshte nobody invented befole, I don't know! Cullently I'm using baka velsion online; it wolks pelfectly!
--- "To pee or not to pee, that is the question." ---
What is wild is that anyone with half a working braincell would use a photocopier in an office where a copy of every document is sent to an uncontrolled 3rd party for translation.
Yeah, put that baby in the CEO's office..
(not to mention the fact that there's a huge gap between mechanical translations and the subtleties of language only a skilled translator and/or native speaker has any hope of translating).
So, IMHO cute idea, but don't expect me to buy one any time soon.
Insert
Common exercise: take the article, drop its text into Babelfish, and translate it from English to another language and back again. When doing so from English to Japanese and back, the results are:
Something has been lost, I think. Let's try a quick trip through Europe, though. English to Spanish, Spanish to French, French to German, and German back to English:
Honestly, I'm impressed, "as the fish and Google of Babel translate."
$nice = $webHosting + $domainNames + $sslCerts
My street is taken over by Latino gangs. (Most of the gang members are kids from Central America.) These guys hang out in front of my building selling rock all day.
The owner's had cameras installed to deter the selling of crack in front of the building. The problem is, nobody cared. The cameras are small and not easily seen. They went about slingin' rock as usual.
I decided to print up a sign with a big camera on it. I typed out something like "This building is watched by Surveillance Cameras. Any illegal activity will be recorded and taken to the police." (Don't worry all you privacy zealots, the police don't care about recordings of gangs dealing crack. I tried. I just want them and the crackheads they attract to go elsewhere.)
The problem is, they don't speak English and I don't speak Spanish. Rather than ask somebody I went to Babelfish and typed it in and pasted it into my nifty sign.
I posted these signs all around the building. An hour later they were all gathered around one sign laughing at the ridiculous translation. They took them down to show to their friends. Apparently they were really funny.
You can still buy crack in front of my building.
Adobe Acrobat Pro. does OCR that separates graphics from text, scanning TIFF files into PDFs, so there's nothing new there.
For the translation, what it didn't say is where the remote server that the printer connects to is located, or what the quality of the translation is. This remote server could very well be Google Translate or Babelfish.
Correct me if I'm wrong, but aren't Japanese books read from right-to-left top-to-bottom? If so, I wonder if you could photocopy a page straight from a Japanese book and have the translated page written in standard English left-to-right format?
I to buy that exactly, to type this short text, and from Japanese from English
You tried in order to translate. As for that you see rather in me
It is good, don't you think? so is?
I'm not worried. Or, as this copier would say, The troubling one it is to have [indef. pron] not very and much.
you are in a foreign country and you want to say something to a local. Dial up a number on your mobile, say what you want to say and either the translation gets spoken back to you or it turns up, a few seconds later, in a text. I know speech recognition is a bit harder than OCR but it's only a matter of time before they perfect it.
I have excellent Karma and I am not afraid to Troll it.
Does John Searle's Chinese Room argument count as prior art for a patent?
Break the sound barrier - bring the noise.
As a research chemist for a large company, I have had to teach myself the unique language of machine translated Japanese for when I'm in literature search mode. We pay $30 a pop for an instantaneous machine translated JP patent through a web-based service. The service is tuned to patent phrases so it's not as bad as Engrish, but it takes getting used to. The translations are good enough to get the gist, but if the reference is going to be used as prior art for a patent filing, we'll spring for a human translation which costs a whole lot more and takes a lot more time to get. I wonder how this copier would stack up against the service we use in a cost/benefit analysis?
I guess the translation backoffice server farm will see a lot of requests with vocabulary like: tentacled monsters, school-girl superhero, Crystal Tokyo, giant mecha, exoskeleton, space battleship, katana, ecchi, hantai, otaku, bishoyo, bishonen, senshi, H-rated, bukkake, gokku etc.
I read via another /. story recently that automated voice translation on the fly is a hot issue for DARPA these days. I wonder if the algo used by this OCR method would be of any assistance in getting a two-stage unit that uses voice recognition off the ground?
Obviously the translator was all at sea.
Pining for the fjords
What does it do when the translated text doesn't fit the original layout? Resize the font to an unreadable size?
Fuji Xerox's secret lies in networking the unnamed copier to a dedicated translation server and combining this with algorithms that can distinguish between text, drawings and lines for maintaining page layouts.
In other words, it's not a translation box at all. It's a networked scanner/copier that passes the scan to a server parked somewhere else to modify the page and send it back to the printer.
To call it a translator without mentioning the big box sitting across the office that goes with it, is a bit fraudulent. Reminiscent of the chess playing robot frauds of years back, that were run by a chessmaster hiding somewhere nearby running the mannequin.
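The split described in this comment (a copier that only scans and prints, plus a separate server that does the actual work) could be sketched roughly like this in Python. Everything here is an assumption for illustration: the `Region` type, the function names, and the one-entry toy dictionary standing in for a translation engine. The article does not describe Fuji Xerox's actual software.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Region:
    kind: str     # "text", "drawing", or "line" (the three things the article says are distinguished)
    content: str  # recognized text for text regions, an opaque label otherwise


def translate_page(regions: List[Region], translate: Callable[[str], str]) -> List[Region]:
    """Server side: translate text regions and pass everything else through in place."""
    return [
        Region(r.kind, translate(r.content)) if r.kind == "text" else r
        for r in regions
    ]


def copy_with_translation(scan: List[Region], server: Callable[[List[Region]], List[Region]]) -> List[Region]:
    """Copier side: it only ships the scan to the server and prints what comes back."""
    return server(scan)


# Stand-in translator: a hypothetical lookup table, not a real engine.
TOY_DICTIONARY = {"こんにちは": "Hello"}


def toy_translate(text: str) -> str:
    return TOY_DICTIONARY.get(text, text)


page = [Region("text", "こんにちは"), Region("drawing", "<figure 1>")]
result = copy_with_translation(page, lambda p: translate_page(p, toy_translate))
print(result)
```

The copier function has no translation logic in it at all, which is the point being made: swap the server and you have a different product.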
I work for the Department of Redundancy Department.
This is no more amazing than taking your desktop OCR software, scanning a page, pasting the result into Google Translate, and printing the resulting page. The reason nobody is shipping this is because the translation isn't good enough yet to make it worthwhile.
Well, I mean the packing list includes a "for glue the sex rubber mat" -- so it can't possibly be that bad, could it? Actually, ew. It's an ethernet switch. I really don't want to know what that bit's for!
It sounds like they've come up with the greatest invention of all time - an ethernet switch that allows you to completely circumvent the computer and have your pr0n delivered directly to the ultimate....ah...consumer. God bless those crazy bastards.
I've played around with a lot of OCR technology and I have to say, it just doesn't work perfectly yet. Anyone looking to just get text recognized, they'd better be prepared to run it through a word processor and give it an additional proofreading. And we also know how effective babelfish is. So we're to expect that they can take raw text freshly OCR'd, run it through a babelfish workalike and get something out the other end that doesn't read like a poorly translated engrish technical manual? Or do they just have very low expectations?
Kwisatz Haderach
Sell the spice to CHOAM
This Mahdi took Shaddam's Throne
Okay, OCR a page of text. You'll probably end up with 5% typos. Now pass that through a machine translator. Laugh at the results. From a Russian cruise ship notice: Behold many whistles! Pursue life savering equipments and bang convolve across the bosoms. Flee then to the indifferent career ships whereast obediencing the orders of the vessel chef. ... and so on from 3454 web sites that collect broken English.
Someone should make a prank copier which uses OCR and replaces some words in the document you make a "copy" of... Replace all occurrences of "dear" with "esteemed yet stupid", "boss" with "monkey boy", "accounting" with "bean-counters", "engineers" with "propeller heads", and "best regards" with "Die in a fire".
Esteemed yet stupid monkey boy,
Today I discussed our finances with the bean-counters office. They stated that the propeller heads are having a problem getting enough supplies to finish the project this week. How should I approach this problem?
Die in a fire,
- Worker name
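For what it's worth, the substitution half of that prank is a few lines of Python once the OCR has produced plain text. This is only a sketch: the word list is the one from the comment, while the function name `prank_copy` and the capitalization rule (keep a leading capital if the original word had one) are assumptions added here.

```python
import re

# Substitutions taken from the comment above.
SUBSTITUTIONS = {
    "dear": "esteemed yet stupid",
    "boss": "monkey boy",
    "accounting": "bean-counters",
    "engineers": "propeller heads",
    "best regards": "Die in a fire",
}

# Try longer phrases first so multi-word entries win over shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(SUBSTITUTIONS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def prank_copy(text: str) -> str:
    """Return the 'copied' text with every listed word or phrase swapped out."""

    def swap(match):
        original = match.group(0)
        replacement = SUBSTITUTIONS[original.lower()]
        # Keep a leading capital so "Dear" becomes "Esteemed yet stupid".
        if original[0].isupper():
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    return _PATTERN.sub(swap, text)


letter = "Dear boss,\nThe engineers talked to the accounting office.\nBest regards,"
print(prank_copy(letter))
```

The `\b` word boundaries keep it from mangling words that merely contain "boss" or "dear" inside them.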
If there's anyone I hate more than stupid people, it's intellectuals.
"Please to be doing the needful"
If a Japanese receptionist sits on the copier, does it switch between portrait & landscape mode ?
Wanna fight ? Bend over, stick your head up your ass, and fight for air.
Now they can more efficiently create terrible manuals.
These guys saying that the technology won't be here within their lifetime have to be ancient or just forgetting how rapidly the pace of technological change has been accelerating of late. How long ago was it that this here "Internet" only had a few hundred nodes?
I am not exaggerating when I say that automatic translation from extremely dissimilar languages requires strong AI. You need to be able to guess what a person is thinking from what they're expressing to map it into a different way of expressing themselves. You also need strong AI to understand the flow of conversation when terms are not expressed strongly.
As an example, Japanese doesn't really have a word that maps to "it." They have a word that maps well to "thing," but nothing that matches "it." This is because pronouns in English fulfill the function of referring back to a concept expressed in a previous sentence to place it in short form in the context of the sentence being expressed.
English Example:
E1: Hey Frank, did you buy that TV yet?
E2: Yeah, I bought it yesterday.
"Japanese" Example:
J1: Hey Hiro, already you that TV did buy? [or still you that TV haven't bought?]
J2: Yeah, yesterday bought. [Note the lack of "I" and "it!"]
Languages like English make translation easy in this regard because you have a generic pronoun to "hold the place" of a specific subject or object. You don't have to know what it is -- you just fill it in. In Japanese, if the conversation were to continue about the TV, the word TV would never be brought up again until the subject of a sentence changed away to something else.
Often, you can have a conversation in Japanese where the subject is never explicitly spoken because it's obvious from the context of the speakers. Given the frequency of homophones in the language (particularly in conjugated forms of verbs), this can make translation maddening if you don't know what the speakers are talking about (because you can't see what they can see or don't know what they know).
This is most frustrating when you're dealing with an author who is using ambiguous or cryptic speech by off-screen characters to give a sense of foreboding or foreshadowing. The conversation is just as cryptic to a native Japanese speaker as it is to us, but we literally cannot translate it to English without knowing the secrets ahead of time because grammatically correct English cannot be that vague!
Anyway, I'm starting to veer off from my original point, which is that accurate translation requires modeling the minds of the speakers, which requires strong AI. A simple dictionary + grammar rule-set or even a theoretically complete database of possible sentences and phrases will never be able to achieve translation because of the inherent differences in the levels of specificity in the two languages, which require you to model and understand the thoughts and intentions of the speakers.
Frankly I've mostly given up on strong AI within my lifetime after so many decades of empty promises, so I don't see accurate automated translators coming any time soon.
A final thought:
While I've harped on the difficulty of going from Japanese to English, there are some tricky parts of going the other way -- I just don't have as much experience. The one time I wrote a letter in Japanese for a class that included words I didn't know beforehand, I ended up accidentally using words that sounded bizarre and in one case insulting because words in different languages don't map 100% to each other. A word that means the same thing in Japanese and English for one use may not mean the same thing in another. For example, you can use both "karu" and "kiru" to mean "cut" when talking about hair, but you'd use "karu" for mowing grass or shearing sheep, and "kiru" for chopping up fish and accidentally cutting your finger. The relationship between words is a Venn diagram, and computer translation gets that wrong when it's unable to realize what the (omitted) subject or object of the sentence was.
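The Venn diagram in that last paragraph can be written down directly as sets. A small Python sketch, using only the six usages named in the comment (the object labels and the `verbs_for` helper are made up for illustration):

```python
# Which objects each verb for "cut" is used with, per the comment above.
USES = {
    "karu": {"hair", "grass", "sheep"},
    "kiru": {"hair", "fish", "finger"},
}


def verbs_for(obj=None):
    """Verbs that fit the object being cut; every verb is possible if the object is unknown."""
    if obj is None:
        return sorted(USES)
    return sorted(verb for verb, objects in USES.items() if obj in objects)


# The overlap of the two circles: the only object where either verb works.
shared = USES["karu"] & USES["kiru"]

print(shared)
print(verbs_for("grass"))
print(verbs_for())  # omitted object: the choice cannot be narrowed down
```

With the object known, the choice is usually forced. With the object dropped from the sentence, both verbs stay on the table, which is where the comment says machine translation goes wrong.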
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
OP should be modded as 'Flame-bait', rather than 'Troll', then?
Vegeta, how many copies can we get per toner cartridge?
(Does anyone care to guess?)
we will have seamless translation, as we drive around in our fusion powered flying cars
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
I can't imagine this is worlds better than other translation software, which is usually fraught with problems due to the differences in the structure of the languages. People speaking different languages natively don't even conceptualize the same way.
This sounds like an engrish generator....
This sig contains a manual self-destruct. Kindly please put your foot through your monitor in 8 seconds.
Give me voice transcription that actually works, then start worrying about translating the jabberwocky to another language.
Wildness. Fuji drew up the photocopying machine which automatically translates the document from English from Japanese. That is the clean nut. Whether with respect to appearance, as for the copier the text, as for OCR what kind of section text, you send that to the translation engine, and to the place reset English it can grasp.
Its powder level! It's over 9000!
(I tried so hard not to rise to the bait... I honestly did. Then I thought of the power/powder pun and I couldn't stop. Damn you Psykechan! Damn you!)
> On the other hand, you really don't need to skip your medication to imagine an automated translator for complete documents from Japanese to English. The lack of context from sentence to sentence could be resolved the same way that humans resolve it, and it wouldn't require your AI to see what the humans are seeing or anything like that.
Well, the way humans do it revolves around their understanding of the people talking and the world around them. The information simply does not exist in the text, it's assumed, and this is very much for complete documents. A complete Japanese sentence leaves things out. It's not incomplete to a Japanese person, it's just "incomplete" for the poor translator who has to track down that information. In a book, it might be 800 pages ahead when we finally figure out what the foreshadowing meant.
Ambiguity is the rule, not the exception, in Japanese. And sometimes, you have to preserve it, especially in works of fiction. That makes it like having to translate something without being able to use the letter E...
Don't get me wrong: computers may be able to do very well translating between languages with lots of overlap (e.g. French/Spanish), but for languages like Japanese & English, they barely overlap at all. Unlike the GP posts, I'm not too worried about simple things like SVO vs. SOV; word order is by far the least of anyone's worries. And kanji can disambiguate a lot better than spoken Japanese ever will. But the fact that some things simply don't get communicated will make it untranslatable unless someone finds a way to supply all of the proper context to the translation.
That's probably not going to happen any time soon. The translators here are absolutely right about having to switch modes. I'm a mere student, but they're not exaggerating the difficulty at all.
Let's so double the killer delete select all your base are belong to us?
You have no chance to survive.
Make your time.
of wonderful creation Fuji being in hopefully market soon?
if this is supposed to be a new economy, how come they still want my old fashioned money?
Thank you very much for your order for 100.000 bottles of Pocari Sweat.
We will boat this on top of evening in 1000 cases, 10 bottles in each case.
Wishing your family will not die when they hear this news.
-- Tigger warning: This post may contain tiggers! --
8000, but the english translation will say 9000.
I don't know that it's a source that you used, but my copy of Expressive Japanese by Senko K. Maynard calls Japanese a "topic-comment" language (and English an "agent-does" language). The ISBN is 0-8248-2889-5.
:( (though I did learn that it's 'sun + origin', as per Land of the Rising Sun, from the escapade).
And I want to thank both of you for your comments; I learned a little by reading them. As you might guess, I'm one of the 8 zillion random nerds who watch anime and study Japanese in their spare time. Naturally, that means that I'm a total n00b at it thus far and am just glad to finally recognize basic kana, but I have to look up anything the tiniest bit "complex" (for me), like that sign at Hana Suki that read (as I found out after looking up the kanji) "nihon jisshoku" ("authentic japanese cuisine" for anyone else reading this).
Yeah, it's bad when I don't yet recognize common kanji combinations like "nihon" and wondered what the hell a 'sun tree' was
You might enjoy the following essay:
http://nihongo.3yen.com/2007-09-03/repeat-after-me-there-is-no-such-thing-as-the-subject/