This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
The title of this article -- "List of English words of Japanese origin" -- gives the misleading impression that these words are in sufficiently widespread and general use to be considered "English" by English speakers. In fact, only a tiny number of them fall into that category. Either the article needs renaming, or there needs to be a disclaimer at the start saying "these words are listed in standard English dictionaries, but..." I tried to do something along these lines but was reverted. Maybe my attempt was not good, but the way it reads right now cannot be correct IMO. Matt 20:43, 12 July 2007 (UTC).
(resetting indent) The three-dictionary rule is nice because it's objective. Would you want to elaborate on which dictionaries are acceptable? Or leave it open? Fg2 ( talk) 08:24, 8 February 2008 (UTC)
I'm amused to see the constant desire to see words that are jargon in specific disciplines or hobbies related to Japan added to this page. Properly call these non-anglicized words jargon, and you can't make an argument against it. Suggest that they might be established loan words, and it gives logophiles the excuse to wage dictionary wars. But call this ever-dynamic page glossary of Japanese words, and there shall be eternal peace (and additions to the list).-- 67.121.120.67 ( talk) 02:35, 12 February 2008 (UTC)
None of these words can be considered part of the English language. They are simply Japanese words transcribed into English.
The title of the article needs to be changed. Globalscene ( talk) 22:00, 30 April 2008 (UTC)
Oxford and Merriam-Webster seem to agree. Most of the words I've checked are labelled either as Japanese terms or as Japanese proper nouns.
Every word on this planet can be transcribed into English; that does not make it an English word. Globalscene ( talk) 16:38, 2 May 2008 (UTC)
Many of these words are naturalized English-language citizens (so to speak). But the list has become bloated once more; "Bokeh" is an example of a word that is still alien here. -- Orange Mike | Talk 16:40, 2 May 2008 (UTC)
This isn't a list of words in an English dictionary; many are in the dictionary, but the entries explicitly say they are Japanese. Globalscene ( talk) 00:14, 4 May 2008 (UTC)
The following words are, indisputably, words of Japanese origin whose meaning would be understood by the average speaker of English and which are part of that speaker's vocabulary:
That's an impressive number of words right there. All the rest should be removed. There's no need to embellish the list. As a reverse analogy, I'm sure that the words "gerrymander" or "McCarthyism" have turned up in Japanese books, but they would not be considered to be part of Japanese vocabulary. I'm not sure that the opinion will be unanimous, but the list should be trimmed down to commonly used words. Mandsford ( talk) 21:54, 4 May 2008 (UTC)
Currently, the inclusion criterion is based on the most credible and reliable third-party sources for word origins and existence: major, published dictionaries. I can't imagine that there is a better (i.e. more reliable and authoritative) source than major published English dictionaries, so I can't imagine any better criterion for inclusion. If you have a specific proposal for inclusion criteria that you believe are better than using major published English dictionaries, please present it; however, caviling about how the list is too long and giving a list of words that should be included, without explaining any independent criterion for selecting those words, is not really a tenable proposal. You Are Probably Not a Lexicologist or a Lexicographer. Nohat ( talk) 04:27, 15 May 2008 (UTC)
What about Typhoon? That's a pretty common English word from Japanese (Taifuu). —Preceding unsigned comment added by Ottawakismet ( talk • contribs) 20:25, 4 November 2009 (UTC)
I removed 'typhoon' because it did not enter English from Japanese, but rather via Portuguese, Greek, and Chinese. Japanese "taifuu" is On'yomi, so it also came from Chinese. Ulmanor ( talk) 04:51, 5 November 2009 (UTC)
Is this the typical level of maturity displayed here? —Preceding unsigned comment added by Globalscene ( talk • contribs) 21:03, 9 May 2008 (UTC)
This is what I'm getting from this discussion. Especially these Japanese-American editors. Yes, I did some background checking, and a lot of people here are actually Japanese. —Preceding unsigned comment added by Globalscene ( talk • contribs) 21:09, 9 May 2008 (UTC)
I wrote a small program to check various online dictionaries for whether a word is listed, and here are the results for the words on this page.
word | american_heritage | merriam_webster | random_house | websters_new_millennium | wordnet |
---|---|---|---|---|---|
adzuki | no | yes | no | yes | no |
azuki bean | yes | yes | yes | no | no |
aikai | no | no | no | no | no |
aikido | yes | yes | yes | no | no |
akita | no | yes | yes | no | no |
atsu | no | no | no | no | no |
aucuba | yes | yes | yes | no | no |
banzai | yes | yes | yes | no | yes |
bento | yes | no | yes | no | no |
bokeh | no | no | no | yes | no |
bokken | no | no | no | no | no |
bonsai | yes | yes | yes | no | yes |
bonze | yes | yes | yes | no | no |
budo | no | yes | no | yes | no |
bukkake | no | no | no | no | no |
bushido | yes | yes | yes | no | yes |
daikon | yes | yes | yes | no | no |
daimyo | yes | yes | yes | no | no |
dan | yes | yes | yes | no | no |
dashi | no | yes | yes | no | no |
dojo | yes | yes | yes | no | no |
domoic acid | yes | yes | no | no | yes |
edamame | yes | no | no | no | no |
ekiden | no | no | no | no | no |
enokitake | no | no | yes | no | no |
enoki mushroom | no | yes | no | no | no |
fugu | yes | yes | yes | no | yes |
fusuma | no | yes | yes | no | no |
futon | yes | yes | yes | no | no |
gaijin | no | yes | yes | no | no |
geisha | yes | yes | yes | no | yes |
genro | no | yes | yes | no | no |
geta | yes | yes | yes | no | yes |
ginkgo | yes | yes | yes | no | yes |
go | yes | yes | yes | no | yes |
gyokuro | no | yes | no | no | no |
gyoza | no | no | yes | no | no |
haiku | yes | yes | yes | no | yes |
hanami | no | no | no | no | no |
happi | no | no | no | no | no |
happy coat | no | no | no | no | no |
hara-kiri | yes | yes | yes | no | yes |
hentai | no | no | no | yes | no |
hibachi | yes | yes | yes | no | no |
hijiki | no | no | yes | no | no |
hikikomori | no | no | no | yes | no |
hiragana | yes | yes | yes | no | no |
honcho | yes | yes | yes | no | yes |
honcho | yes | yes | yes | no | yes |
ikebana | no | yes | yes | no | no |
imari | no | yes | no | no | no |
inro | yes | yes | yes | no | no |
judo | yes | yes | yes | no | yes |
jujutsu | yes | yes | yes | no | yes |
juku | no | no | yes | no | no |
kabuki | yes | yes | yes | no | no |
kaizen | no | no | no | yes | no |
kakemono | yes | yes | yes | no | no |
kaki | yes | yes | yes | no | no |
kakiemon | no | yes | no | no | no |
kami | yes | yes | no | no | yes |
kamikaze | yes | yes | yes | no | yes |
kana | yes | yes | yes | no | no |
kanban | no | yes | yes | no | no |
kanji | yes | yes | yes | no | no |
karaoke | yes | yes | yes | no | no |
karate | yes | yes | yes | no | yes |
karoshi | no | no | no | yes | no |
kata | yes | yes | yes | no | no |
katakana | yes | yes | yes | no | no |
katana | no | yes | no | yes | no |
katsuo | no | no | no | no | no |
katsuobushi | no | no | no | no | no |
katsura | no | yes | no | no | no |
katsuramono | no | no | no | no | no |
keiretsu | yes | yes | yes | no | no |
keirin | no | no | no | no | no |
kendo | no | yes | yes | no | no |
kimono | yes | yes | yes | no | yes |
kirigami | yes | no | yes | no | no |
koan | yes | yes | yes | no | yes |
koi | no | yes | yes | yes | no |
koji | no | yes | yes | no | no |
kombu | no | yes | yes | no | no |
koto | yes | yes | yes | no | no |
kudzu | yes | yes | yes | no | no |
makimono | no | yes | yes | no | no |
manga | no | yes | yes | no | no |
matcha | no | no | no | no | no |
matsuri | no | no | no | no | no |
matsutake | no | yes | yes | no | no |
medaka | no | yes | yes | no | no |
mikado | yes | yes | yes | no | yes |
mirin | yes | yes | no | no | no |
miso | yes | yes | yes | no | yes |
mizuna | no | yes | yes | no | no |
mochi | no | no | no | no | no |
moxibustion | yes | yes | no | no | no |
nappa, napa cabbage | no | no | no | no | no |
nashi | no | yes | yes | no | no |
netsuke | yes | yes | yes | no | no |
ninja | yes | yes | yes | no | yes |
noh | yes | yes | yes | no | no |
nori | yes | yes | yes | no | no |
nunchaku | no | yes | yes | no | no |
obi | yes | yes | yes | no | yes |
ooch | no | no | no | no | no |
origami | yes | yes | yes | no | no |
otaku | no | no | no | yes | no |
oxa | no | no | no | no | no |
pachinko | yes | yes | yes | no | no |
panko | no | no | no | no | no |
rame | no | no | no | no | no |
ramen | no | yes | yes | no | no |
randori | no | yes | no | no | no |
renga | no | no | yes | no | no |
rickshaw | yes | yes | yes | no | yes |
romaji | no | yes | yes | no | no |
ronin | no | yes | no | no | no |
roshi | no | no | yes | no | no |
sai | no | no | no | no | no |
sake | yes | yes | yes | yes | yes |
sakura | no | yes | no | no | no |
salaryman | no | yes | yes | no | no |
samurai | yes | yes | yes | no | yes |
sashimi | yes | yes | yes | no | no |
satori | yes | yes | yes | no | no |
satsuma | yes | yes | yes | no | yes |
satsuma | yes | yes | yes | no | yes |
sayonara | yes | yes | yes | no | no |
senryu | no | yes | no | no | no |
sensei | no | yes | yes | no | no |
seppuku | yes | yes | yes | no | yes |
shabu shabu | no | yes | yes | no | no |
shakuhachi | no | yes | no | no | no |
shamisen | yes | yes | no | no | no |
shiatsu | yes | yes | yes | no | yes |
shiba inu | no | yes | no | no | no |
shiitake | yes | yes | yes | no | no |
shinkansen | no | no | no | no | no |
shinto | yes | yes | yes | no | yes |
shogi | yes | yes | yes | no | no |
shogun | yes | yes | yes | no | yes |
shoji | yes | yes | yes | no | no |
shoyu | no | yes | yes | no | no |
shunga | no | no | no | no | no |
sika | yes | yes | yes | no | no |
skosh | no | yes | yes | no | no |
soba | yes | yes | yes | no | no |
soroban | no | yes | yes | no | no |
soy | yes | yes | yes | no | yes |
sudoku | yes | yes | no | yes | yes |
sukiyaki | yes | yes | yes | no | no |
sumi-e | no | no | yes | no | no |
sumo | yes | yes | yes | no | yes |
surimi | no | yes | yes | no | no |
sushi | yes | yes | yes | no | yes |
tabi | yes | no | yes | no | yes |
taiko | no | no | no | no | no |
takoyaki | no | no | no | no | no |
tamari | no | yes | yes | no | no |
tanka | yes | yes | yes | no | yes |
tanuki | no | yes | no | no | no |
tatami | no | yes | yes | no | no |
tempura | yes | yes | yes | no | yes |
tenno | no | yes | no | no | yes |
teppanyaki | no | yes | no | no | no |
teriyaki | yes | yes | yes | yes | no |
tofu | yes | yes | yes | no | yes |
tokonoma | no | yes | yes | no | no |
tokusatsu | no | no | no | no | no |
torii | no | yes | yes | no | no |
tsunami | yes | yes | yes | no | yes |
tsutsugamushi | no | yes | no | no | no |
tycoon | yes | yes | yes | no | yes |
udo | yes | yes | yes | no | no |
udon | no | yes | yes | no | no |
ukiyo-e | no | yes | yes | no | no |
umami | yes | yes | no | no | no |
umeboshi | no | no | yes | no | no |
urushiol | yes | yes | yes | no | no |
utani | no | no | no | no | no |
uzushi | no | no | no | no | no |
waka | no | yes | yes | no | no |
wakame | no | yes | yes | no | no |
wakizashi | no | no | no | no | no |
wasabi | yes | yes | yes | no | yes |
yagi | yes | yes | no | no | no |
yakitori | no | yes | yes | no | no |
yakuza | yes | no | yes | no | no |
yukata | no | no | yes | no | no |
yumi | no | no | no | no | no |
zaibatsu | no | yes | yes | no | no |
zazen | yes | yes | yes | yes | no |
zen | yes | yes | yes | no | yes |
zori | yes | yes | yes | no | no |
And for historical record, here is the program I wrote to do this:

    #!/usr/bin/perl
    # Reads one word per line (from STDIN or from files named on the command
    # line) and prints a wikitable showing which dictionaries list each word.
    use warnings;
    use strict;
    use LWP::Simple qw(get);
    use Memoize;

    # Four of the five dictionaries are served from the same
    # dictionary.reference.com page, so cache each fetch by URL.
    memoize('my_get');

    # Per dictionary: the lookup URL, a pattern that signals "no entry", and
    # (optionally) a pattern that must appear for the entry to count.
    my %dictionary_data = (
        merriam_webster => {
            url            => 'http://www.m-w.com/dictionary/%word%',
            nomatch_regexp => qr/The word you've entered isn't in the dictionary/,
        },
        random_house => {
            url            => 'http://dictionary.reference.com/browse/%word%',
            nomatch_regexp => qr/No results found for/,
            match_regexp   => qr/Based on the Random House Unabridged Dictionary/,
        },
        american_heritage => {
            url            => 'http://dictionary.reference.com/browse/%word%',
            nomatch_regexp => qr/No results found for/,
            match_regexp   => qr/The American Heritage/,
        },
        wordnet => {
            url            => 'http://dictionary.reference.com/browse/%word%',
            nomatch_regexp => qr/No results found for/,
            match_regexp   => qr/WordNet/,
        },
        websters_new_millennium => {
            url            => 'http://dictionary.reference.com/browse/%word%',
            nomatch_regexp => qr/No results found for/,
            match_regexp   => qr/Webster\'s New Millennium/,
        },
    );

    my @dicts = sort keys %dictionary_data;

    # Table header: one column for the word, then one per dictionary.
    print "{|\n|-\n! word\n";
    foreach my $dict (@dicts) {
        print "! $dict\n";
    }

    # One table row per input word; green cell for "yes", red for "no".
    while (<>) {
        chomp;
        print "|-\n| $_\n";
        foreach my $dict (@dicts) {
            if (is_in_dictionary($_, $dict)) {
                print qq(| style="background:#99FF99" | yes\n);
            } else {
                print qq(| style="background:#FF9999" | no\n);
            }
        }
    }
    print "|}\n";

    # Returns 1 if the dictionary's page for $word looks like a real entry.
    sub is_in_dictionary {
        my ($word, $dictionary) = @_;
        my $url = $dictionary_data{$dictionary}->{url};
        $url =~ s/%word%/$word/;
        my $result = my_get($url);
        # A failed fetch returns undef; count it as "no" rather than letting
        # it fall through to the default "yes" below.
        return 0 unless defined $result;
        my $nomatch_regexp = $dictionary_data{$dictionary}->{nomatch_regexp};
        my $match_regexp = $dictionary_data{$dictionary}->{match_regexp};
        if ($result =~ $nomatch_regexp) {
            return 0;
        } else {
            if (defined $match_regexp) {
                return $result =~ $match_regexp ? 1 : 0;
            } else {
                return 1;
            }
        }
    }

    # Thin wrapper around LWP::Simple::get so that Memoize can cache it.
    sub my_get {
        my ($url) = @_;
        my $result = get($url);
        return $result;
    }

Nohat ( talk) 05:51, 15 May 2008 (UTC)
What to do with this data? For starters, I think words which don't appear in any of the dictionaries are good candidates for culling, provided no one can show that the word appears in a dictionary other than the ones given here, such as the Oxford English Dictionary. Nohat ( talk) 05:53, 15 May 2008 (UTC)
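One way to act on that suggestion is to count the "yes" cells in each row of the results table and pick out the words with none. The sketch below is only an illustration, not part of the program above: `parse_row` and `select_words` are hypothetical names, the sample rows are copied from the table, and the "three or more" threshold is just one reading of the three-dictionary rule mentioned earlier in this discussion.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one pipe-delimited row such as
#   "bokken | no | no | no | no | no |"
# into the word and the number of dictionaries that answered "yes".
sub parse_row {
    my ($line) = @_;
    $line =~ s/^\s+//;
    my ($word, @cells) = split /\s*\|\s*/, $line;
    my $yes = grep { $_ eq 'yes' } @cells;
    return ($word, $yes);
}

# Return the words whose "yes" count lies between $min and $max inclusive.
sub select_words {
    my ($min, $max, @rows) = @_;
    my @selected;
    foreach my $row (@rows) {
        my ($word, $yes) = parse_row($row);
        push @selected, $word if $yes >= $min && $yes <= $max;
    }
    return @selected;
}

# Sample rows copied from the results table (assumption: same layout).
my @sample_rows = (
    'bokken | no | no | no | no | no |',
    'bokeh | no | no | no | yes | no |',
    'edamame | yes | no | no | no | no |',
    'sake | yes | yes | yes | yes | yes |',
    'sushi | yes | yes | yes | no | yes |',
);

# Words listed in none of the five dictionaries: candidates for culling.
my @cull = select_words(0, 0, @sample_rows);
print "cull candidates: @cull\n";

# Words listed in at least three of the five dictionaries.
my @keep = select_words(3, 5, @sample_rows);
print "in three or more: @keep\n";
```

The header and separator lines of the table would have to be skipped before feeding real rows to `select_words`, since they contain no "yes" cells and would otherwise show up as cull candidates. A word flagged this way would still need a manual check against dictionaries not covered here, such as the Oxford English Dictionary.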