Flammie

Flammie A Pirinen on github pages



Flammie on longest words

Popularised linguistics is full of factoids about what the longest word in a given language is. The question interests people because it seems to produce funny results. Unfortunately, in reality the answer is very boring. Whenever there is a systematic way to combine words or morphs together, the only limit on how long a word can be is how long a word we bother to write. Even English, which is notoriously bad at combining such word parts (or good at adding spaces in such combinations), has some such words. The common sources of infinitely long words are scientific applications, e.g. the need to encode chemical formulas, or just numbers, as words.

Of course, in the world of factoid hunting one can always say that these are not real or acceptable words for some arbitrary reason. Then one must look at what is actually in use. This is something we linguists know very well as corpus research, and as computational linguists we can search a whole corpus in seconds. Unfortunately, even then we usually end up with boring answers that do not make fun factoids: for example, the longest words in Wikipedia articles turn out to be antidisestablishmentarianisms or some such, or colourful coats that are red-green-blue-yellow-white-black-… just boring.
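To give an idea of how little work such a corpus search is, here is a minimal sketch in Python. The function name `longest_words`, the sample sentence and the definition of a word (a run of letters, possibly joined by hyphens) are my own assumptions for illustration; a real corpus would need a proper tokeniser.

```python
import re

def longest_words(text, n=3):
    """Return the n longest distinct word tokens in text, longest first."""
    # Assumption: a "word" is a run of letters, optionally joined by hyphens.
    tokens = set(re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower()))
    # Sort by descending length, then alphabetically to break ties.
    return sorted(tokens, key=lambda w: (-len(w), w))[:n]

# Made-up sample text, standing in for a corpus.
sample = "The red-green-blue-yellow coat was worn by a soap salesman."
print(longest_words(sample))
# prints ['red-green-blue-yellow', 'salesman', 'coat']
```

As expected, the winner is a hyphenated colour list and not anything that would make a fun factoid.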

English

In English chemistry and related fields, probably organic chemistry in particular (I am not a chemist), you can describe things like proteins and complex polymers by putting the names of the components together as one word. These names usually contain numeric prefixes like bi- and tri-, but also numbering systems like meth- for one, eth- for two, etc., interleaved with -yl, -ene, -in, etc. Or just look at IUPAC nomenclature of organic chemistry (Wikipedia). Incidentally, I found this on: Titin#Linguistic significance (Wikipedia).
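The way protein names grow can be sketched roughly as follows. This is a simplification under my own assumptions, not the actual IUPAC rules: every amino acid in the chain except the last takes its combining form in -yl, and everything is written together. The table `RESIDUE` is deliberately partial and the function name `peptide_name` is made up for this example.

```python
# Partial, illustrative table: amino acid name -> combining form in "-yl".
RESIDUE = {
    "methionine": "methionyl",
    "threonine": "threonyl",
    "glutamine": "glutaminyl",
    "alanine": "alanyl",
    "glycine": "glycyl",
    "leucine": "leucyl",
    "serine": "seryl",
}

def peptide_name(chain):
    """Join a non-empty chain of amino acid names into one long word."""
    *body, last = chain
    # All but the last component take the "-yl" form; the last stays as it is.
    return "".join(RESIDUE[acid] for acid in body) + last

print(peptide_name(["methionine", "threonine", "threonine", "glutamine", "alanine"]))
# prints methionylthreonylthreonylglutaminylalanine

# The word simply grows with the chain: a thousand glycines give 6001 letters.
print(len(peptide_name(["glycine"] * 1000)))
```

The only limit is the length of the chain you feed in, which is exactly why the result is not very interesting as a longest word.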

Most other languages

A great many languages have compound words, that is, words written together to make new words. This means that if you have a salesman selling X’s, you come up with Xsalesman, no matter what X is. As you can see from the word salesman, English also does some combining: sales and man have become a compound, but soap salesman, for example, will usually be written with a space, where other languages will more systematically leave it out. Where this gets boring for the longest word competition is that numbers are commonly compounded like this. The longest word in any such language would be, for example, one third read out in decimal form: 0,333333… How long it gets depends again on where we cut it, that is, how many threes we compound together before we get bored. The system for large numbers, by the way, is quite similar to chemistry: the numbers 1 to 9 have their own words, and so do the multipliers 10, 100 and 1000. Then 10.000 and 100.000 are just 10×1000 and 100×1000. Then come the millions, which follow a new system: for every further multiplier of a thousand we form a new number word from a prefix mi-, bi-, tri-, etc. and then -illion, and in some languages these alternate with -illiard. I haven’t checked the standards, but I assume the prefix system loops back to millions itself, being infinite.
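The large number system can be sketched in a few lines of Python. This is only an illustration with English spellings: the stem table `STEMS` stops at dec-, and the function names `short_scale` and `long_scale` are my own labels for the two variants, one where a new -illion word appears at every multiplier of a thousand, and one where -illion and -illiard alternate.

```python
# Stems of the first ten "-illion" words, English spelling; the table is finite.
STEMS = ["m", "b", "tr", "quadr", "quint", "sext", "sept", "oct", "non", "dec"]

def _check(exponent):
    if exponent < 6 or exponent % 3:
        raise ValueError("expected a multiple of 3 that is at least 6")

def short_scale(exponent):
    """Name of 10**exponent when every factor of 1000 gets a new -illion."""
    _check(exponent)
    return STEMS[exponent // 3 - 2] + "illion"

def long_scale(exponent):
    """Name of 10**exponent when -illion and -illiard alternate."""
    _check(exponent)
    suffix = "illiard" if exponent % 6 else "illion"
    return STEMS[exponent // 6 - 1] + suffix

for e in (6, 9, 12, 15):
    print(e, short_scale(e), long_scale(e))
# 6 million million
# 9 billion milliard
# 12 trillion billion
# 15 quadrillion billiard
```

Past the end of the table the lookup fails with an IndexError, which is where one would need to know how the real prefix system continues.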

References

I had kind of started writing this article before watching this YouTube video, but I got around to finishing it after seeing it, since he explains it very neatly: