The article seems to think that a word is untranslateable if there is no single word in the target language. If I'm not misreading the article, then this is completely obvious -- just consider the number of words in English and the number of words in almost any other language, and you will find that there are more English words than the other language. It is now clear that there exist English words that don't correspond to a single word in the other language.
> It is now clear that there exist English words that don't correspond to a single word in the other language.
But that's true of any language. Not only that, but English uses loanwords heavily which are often Anglicisations of words from other languages, which may not in themselves be just one word.
"Ho ho ho", the flag-waving Little Englander types say, "Gaelic is such a stupid language, they don't even have a word for 'television', they just say 'television' in a stupid accent!"
But English also has no word for "television". Worse, the word "television" isn't even just a loanword, it's two words from two different languages, "tele" from Greek and "vision" from Latin. What a bodge job! Imagine letting something like that slip through to production use!
The hypothetical Catalan-Hungarian inventor of it in another leg of the trousers of time may have called it llunylátás, and then where would we be?
Well, most languages would have some variant of that word to mean "television", as they do now, I expect.
The English word "galore" (meaning "sufficient" shading towards "more than enough") comes from the Gaelic words "gu leòr", (goo lyaawr, the grave accent above the o makes the vowel sound longer). What a silly language English is, doesn't have a word that means "more than you're ever likely to need", has to steal one from Gaelic and then spell it wrong.
Oh, they use this word "whisky". You know what that means? It means "uisge beatha" but they only say the first word, in a silly accent because they can't pronounce it properly.
Quite often there's no single word for a thing you're trying to translate but that doesn't mean it's untranslateable. English has only one single word for rain, for example, but Gaelic has about half a dozen of which the only ones I can reproduce here are "uisge" (that word again) which just means "water", and "fras" which is more like a gentle shower. The rest of the words in the Gaelic of the North-West of Scotland that refer to rainy weather are, of course, profane in the extreme.
"English also has no word for 'television'." Oh, for goodness' sake. OF COURSE English has a word "television". The fact that you can trace its etymology back to Greek and Latin doesn't mean it's not an English word. If you confronted a native speaker of Latin who also spoke Greek (a common situation back then, also vice versa), they would have no more clue what "television" meant than most English speakers have about what a "Fernseher" is.
I found the word τηλαυγής (telauges), "far-shining", meaning "visible from far away". Like a lighthouse. So some theoretical ancient might hear "television" and understand it as "looking at distant landmarks".
Now here's where it gets interesting: there is no agreed-upon definition among experts what a word is. So there's no point in arguing about it if the thing we're arguing about doesn't even have a rigorous definition.
Not to mention that the English dictionary is stuffed with legacy words that no native speakers understand. Is a word even part of the language if no native speaker uses it? That's another debate.
It kind of is a proof if we assume that single words can be translated at all. Translate a single word from Language X (more words) to language Y (fewer words) and back. I can't uniquely recover all the words in Language X that way.
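To make the round-trip argument concrete, here is a minimal sketch. The two lexicons are entirely made up (hypothetical words `x1`..`x4` and `y1`..`y3`), and it assumes the strong premise that every word maps to exactly one word in the other language, which is the premise being debated in this thread.

```python
from collections import defaultdict

# Hypothetical toy lexicons: language X has 4 words, language Y only 3,
# so at least two X words must share a Y translation (pigeonhole).
x_to_y = {
    "x1": "y1",
    "x2": "y1",
    "x3": "y2",
    "x4": "y3",
}

def round_trip_collisions(forward):
    """Group source words by translation; any group with more than one
    member cannot be uniquely recovered when translating back."""
    groups = defaultdict(list)
    for source, target in forward.items():
        groups[target].append(source)
    return {t: sorted(s) for t, s in groups.items() if len(s) > 1}

collisions = round_trip_collisions(x_to_y)
print(collisions)
```

Whatever single X word you pick as the back-translation of `y1`, one of `x1` or `x2` is lost, which is all the counting argument claims.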
I don't know about that. For many practical purposes probably not?
I'm just on the thread following this idea: "The article seems to think that a word is untranslateable if there is no single word in the target language"
So we're talking about "translatability" of single words. Mapping multiple words of language X to one word of language Y is going to have some effect on translation.
That is the crux of the article's premise: each synonym conveys a similar denotation (the principal component, I think the article called it), but usually with some difference in connotation (the off-axis contributions). You can nudge the languages' vectors towards each other by adding enough synonyms and modifiers together, but they are always a little bit off even then.
So, really, this can be simplified to the question "can written text fully convey all human concepts", some of which have labels in only some languages, and the answer is an obvious "no".
I thought there was a difference between those two in how they start burning, like one needs an external flame to start while the other can burst into flame without an obvious ignition source.
But what joyful means to you likely differs from what it means to me, simply because we haven’t read the exact same literature and had the same conversations.
True, but many languages now have words that were absent from their earlier vocabularies. Shakespeare did not have the option to use 'telephone', 'semiconductor' or 'entropy'.
Not sure this approach really accounts for the difference between a language like German where you have one compound word for a concept that would require multiple words in English. For one good example, the German "Nomenkompositum" is "compound noun" in English.
Some giant portion of English vocabulary is actually made up of compound words. English loves using compound words but only if the roots are sourced from Latin or Greek: words like electrocardiogram ("electric heart picture", sourced from Greek), agriculture ("field nurturing", from Latin), and telecommunication ("far sharing", a hybrid of Latin and Greek roots). Probably the overwhelming majority of the words in an English dictionary will be compound words, and people regularly coin neologisms ("new words") using this formula.
An English speaker might be willing to accept componoma ("names placed together", Latin) or synthetonoma (also "names placed together", Greek) without breaking stride.
I wasn’t saying there are no compound nouns in English at all. If you count portmanteau words like “Brexit” and jargon, there is a massive abundance of them. All I was saying is that the approach would count certain concepts as untranslatable when they clearly aren’t, simply because in one language you have a compound word and in the other language you use several words to express the same concept. It’s definitely not untranslatable, but the translation function isn’t one to one.
I think your point basically asks the question "what counts as a word" because clearly German has infinitely more "words" than would ever appear individually in a dictionary. I'm saying that English does, too.
A couple of ape cubs who learned sign language saw a duck and invented "waterbird". We have to know two dead languages to know if aquaplaning or hydroplaning is the right word.
> English loves using compound words but only if the roots are sourced from Latin or Greek: words like electrocardiogram
This is false; English loves using compound words. One example of such a compound word is "fire department", which has identical syntax to the German compound "Feuerwehr". Whether a compound word is spelled with or without internal spaces is not a fact about the language, it's a fact about the spelling.
If you ignore the spaces, the only real difference between German and English compound nouns is the infixes between elements to show bracketing. Case in point: Nomenkompositum
It's the same structure in both languages. Just because it's written as if it were a single unbreakable word doesn't mean it is--or contrariwise, the fact that it's written as two things with a space in between doesn't mean that it's two "words" in English. The problem lies in the meaning of "word." Is 'doghouse' one word in English, while 'dog house' is two? No.
That's just a difference in orthography. English could easily have had an orthographic standard where we write "compoundnoun" for compounds. This is in contrast with a language like French, where compound nouns are relatively rare. Compare English "Olive oil" and German "Olivenöl" with French "huile d'olive". In French you need to have a preposition to combine the two nouns, whereas English and German do noun-noun composition.
You are right but neither yours nor those of the previous posters are good examples of compound nouns.
These examples have just the meanings of a noun + adjective or of a noun + noun in genitive case, where some languages are lazier than others and omit the markers of case or of adjectival derivation from noun, which are needed in more strict languages.
There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words, but only some related meaning (usually either a pars pro toto meaning or a metaphorical meaning). Those are true compound nouns, not just abbreviated sequences of words from which the grammatical markers have been omitted.
Such compound words were very frequent in Ancient Greek, from where they have been inherited in the scientific and technical language, where they have been used to create names for new things and concepts, e.g. arthropod, television, phonograph, basketball, "bullet train" and so on.
Compound words of this kind are almost never translatable, but they are frequently borrowed from one language to another. During the borrowing process the component words are sometimes translated, but the result is not a translated word; it is a new word added to the destination language.
> There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words, but only some related meaning (usually either a pars pro toto meaning or a metaphorical meaning)
The example that people often quote from German is “kummerspeck” which would literally translate as “grief bacon”, but means weight you put on through comfort eating having gone through a bereavement or other trauma.
> There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words
Wouldn't cranberry morphemes be good examples of this type of relationship? I don't know whether, in the eponymous example, cran- being bound precludes the word from being counted as a closed compound, though.
> and you will find that there are more English words than the other language. It is now clear that there exist English words that don't correspond to a single word in the other language.
You're forgetting about synonyms. The common adage that English has the largest vocabulary stems from the fact that it often has multiple words for the same thing. Sofa, couch. Autumn, fall. Etc etc. Other languages generally don't do this. I've never heard anyone suggest that English has words for more concepts.
Sofa and couch are only interchangeable in some contexts. They are different “flavors” of similar ideas.
This becomes immediately apparent (and relevant) when writing fiction or poetry. At least it does to me.
Non-fiction and spoken English do not highlight the subtleties between these words because using them interchangeably in the same work is considered bad form.
There are relatively few cases of true synonyms in English (or any language). There are subtle differences in meaning, register, etc that are recognized by native speakers.
I don't know what you mean by a "true" synonym, but that is false. There are historical reasons there is a lot of word-doubling. The fact that synonyms might carry additional subtle connotations -- i.e. maybe you find "autumn" more poetic than "fall" -- doesn't change the fact that they are synonyms.
One of the main points of language is to evoke ideas in others' minds. "Autumn" and "fall" will normally evoke different ideas in US English (the former bringing about views of the cozier parts of the season, and the latter being more sterile, used to refer to a particular region of time). Maybe we disagree about what a "true synonym" is, but that distinction seems important to me.
But perhaps all languages have a countably infinite number of words, in which case that proof doesn't work. (In English we have: legless, leglessness, leglessnessless, leglessnesslessness, ... It's not a great example, but it's good enough.)
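For what it's worth, the legless example can be written as an unbounded generator. This is only a sketch of that one suffix pattern (alternating "-ness" and "-less"), not a claim about how English morphology actually works.

```python
from itertools import islice

def legless_words():
    """Yield legless, leglessness, leglessnessless, ... without end."""
    word = "legless"
    suffixes = ("ness", "less")  # alternate the two suffixes forever
    i = 0
    while True:
        yield word
        word += suffixes[i % 2]
        i += 1

# Take only the first few; the generator itself never terminates.
first_four = list(islice(legless_words(), 4))
print(first_four)
```

Each output is distinct from every earlier one, so the enumeration pairs the "words" with the natural numbers, which is what countably infinite means here.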
Even if the number of words in a language were finite we wouldn't have a reasonable way of counting them. There are too many kinds of fuzziness involved in deciding what counts as a "word" and you can't ignore the borderline cases because the borderline cases vastly outnumber the straightforward cases.
"...there are more English words than the other language" There might be more words in some English dictionaries than in some dictionaries of other languages, but that may just be due to a lot more effort having gone into English lexicography than X language lexicography. I doubt that most native speakers of English know more English words than equally educated native speakers of some other language know words of their language.
There's a real irony that the examples are coming from Japanese since it is an agglutinative language.
I think people don't realize how weird language is. Like you could look at Chinese and call each sentence a "word" as there are no spaces. What's the difference between that and a compound word like "nighttime" or the whole German language where you got words like Krankenwagen ("patient" + "car").
Now this doesn't mean there aren't words or phrases that aren't translatable. But the thing is we can always translate the words themselves. What we can't always translate is the meaning behind them. I think the best example of this comes from Star Trek and the Tamarian Language[0,1]. "Sokath, his eyes open!" The problem with communication is not that the words don't translate, it is that the meaning behind them doesn't. Just as people struggle with idioms when learning American English or why someone might be confused about why someone "shit in the milk" or "fucked the dog". Words are an embedding. A compression.
The thing people are constantly forgetting, but which is more important than ever in a globally connected world, is that words are not perfect representations of thoughts. We compress our thoughts into them and hope the person on the other side can decompress them. It is why you can communicate more easily with your close friends, who have better context, than you can with another person who natively speaks your language, and it is why someone who learns a new language can speak perfectly well but still struggle to communicate. Language is not just words, it is culture[2]. So in a much more connected world today we have these disconnects in culture and thus in the interpretation of what people say. I know every one of you has been told to "speak to your audience", but how do you speak to your audience when your audience is everybody and when you don't know who your audience is? The new paradigm requires us to be much better interpreters than we were before. Otherwise everyone is going to sound crazy, other than those you frequently talk to and share that understanding with.
[2] This is, btw, why people argue for embodied AI being so critical. Not because LLMs can't appear to grasp the language, but because we as humans have embodied our language so deeply you probably didn't even realize that I used the word "grasp" to refer to an abstract concept and not something you can actually touch with your hand.
In another blog post where he uses "shibui" as an example of an untranslatable word, he says, "Saying shibui like that, in a mere second, conveys what would otherwise make a clunky and unnecessarily long digression."
At the root of nearly all the blog posts like this one (basically explaining why they don't agree with a widely held belief) is a redefinition of a term or word into something very specific that contradicts the common definition.
Yeah, I was interpreting 'untranslatable' to mean what it says, but they meant 'untranslatable with only a couple words', which is a very different claim.
I think a succinct way to describe my thoughts on linear algebra/language is that language has high dimensionality (ie many different basis vectors that may not necessarily be orthogonal) and that individual languages use a unique coordinate system to express thought. Each language is a lossy approximation of all conceivable thought and some languages can more efficiently represent the “all thoughts” vector space because they have basis vectors that point in more uncommon directions (like the go to Japan example). So while you can more or less point to any thought in any language, some thoughts are easier to express in certain languages, and those are what the post (and I) consider to be untranslatable words.
I tried to find the really interesting article about language and color that describes how some cultures use different naming schemes for colors but couldn’t find it. It talked about how back in the day we didn’t know orange as a color; we just thought of it as red-yellow, and only after the fruit was distributed did the word for the color catch on. Here’s the best article I can find that talks about this phenomenon https://burnaway.org/magazine/blue-language-visual-perceptio...
> Each language is a lossy approximation of all conceivable thought...
This ultimately boils down to the private language discussion started by Wittgenstein. If you admit public language is a lossy approximation of meaning, you're taking a position on the existence of private languages.
> Each language is a lossy approximation of all conceivable thought
I'm not quite sure I understand this—I do have mental sensations/processes sans language, but I would not characterize them as "thoughts". To me, a thought is inherently linguistic, even if they relate to non-linguistic mental processes. So to me, learning a new language is very literally learning how to think differently.
I think we’re in agreement, but I’m afraid I don’t have the philosophical language to precisely pin my mental model into words (what a meta conundrum lol). I’ll try my best here, but I may come back in a few days with an edit if I can more coherently write my ideas.
I take a slightly more narrow definition of “thoughts” that may be more akin to “expressions” - ideas that can be communicated, so excluding non-linguistic mental processes. I think that may be where we disconnect. A lot of my idea about thoughts comes from the Borges story, Funes the Memorious (short story about a dude who could not forget - interesting read and really clarifies my feelings on my definition of “all possible thought”). In the story he talks about tree leaves, but instead imagine needing a unique linguistic scheme for every single unique snowflake you ever see. It would be a linguistic nightmare! Therefore language must generalize, otherwise it becomes noncommunicable, and that generalization to me induces the “lossy approximation” I attribute to language in my prior comment.
So, in my head Funes’s mind represents the abstract space of all possible thoughts. When we use language, we are stacking words/sentences/paragraphs/etc together almost like vector addition trying to reach a particular point in the thought vector space. Some languages have really clean ways of getting to certain thoughts while others take a mouthful and still don’t get you exactly there (物の哀れ example from link).
I agree with your statement on new languages being different thinking. As you follow that vector addition process to get to the “thought,” different languages will take you on different paths to get to your destination thought because languages encode those vectors differently, even if the destination thought is the same. In my mental model, the act of thinking is putting those language vectors together and tracing their path to get to your thought.
And if my comment still makes no sense - I might have to incubate this thought a bit more :) but I do recommend the story- it’s a quick, thought provoking read.
> I take a slightly more narrow definition of “thoughts” that may be more akin to “expressions” - ideas that can be communicated, so excluding non-linguistic mental processes.
I was glad to read this because it seemed too neat and tidy for "thought" to necessarily be able to be encoded into language, especially in the presence of frequent miscommunication between people that share language, culture, and context.
On language and thinking, I agree that new languages promote thinking differently. But it seems that the difference has to fall short of the Sapir-Whorf hypothesis of informing perception or experience. Which would then limit the extent to which thought, as informed by language, would influence the way one would compose a linguistic representation of some thought/idea/"blob of meaning to be communicated." All to suggest that there is a broader landscape of "thinking-like activities" than those which would be able to be encoded linguistically.
Maybe it's simpler to say that I think of language as more lossy than thought.
I would argue that you can consider those thoughts. But this is the difficult bit: I've had the experience before of thoughts/feelings, whatever you want to call them, where words fall short. Knowing multiple languages helps a bit but it still falls short sometimes (very rarely).
Language is very effective at this, but I don't think thought is inherently linguistic.
To me language is just a way to label, group or organise these things. So when you learn a new one you learn a new 'labeling system/taxonomy' does that sound familiar?
> If the mere sight of the above is like a punch in the face for you, don't worry. I'm not going to math you to death in what follows. I will only remind you of a tiny basic part of it that I think relates to languages.
Yes, that mathematical expression is like a punch in my face, but not for the reason you think. I am offended that the rank of the matrix does not match the dimension of the matrix, not that I'm seeing a matrix.
Interestingly enough for this morning's walk I was musing over the tension between the hypotheses that: 'LLMs can map between languages in the vector space' (thus languages are ~equivalent); and 'Language affects thoughts' (as in German is good for Philosophy and English for getting things done).
If both these thoughts are true, then it would appear that languages have topological characteristics. We can (topologically) map from one to another, 'thoughts' (that is a complex of words) form 'paths on the language manifold' and certain paths may be more 'natural' in one topological form than the other.
My take is that the human brain learns concepts primarily through differentiation. A newborn child who has no concept of a door or a wall has no reason to see the two as different parts. Different languages draw different distinctions, but one can always compound concepts and differentiate them differently.
To extend: there will also be general alignment tendencies towards those readily mapped and expressed concepts within available language. Hard but useful concepts can get mapped to idioms. Modes of categorization will be influenced by these factors, which in turn influences many processes.
What is a single word in one language is a collocation of words in another, possibly context-dependent.
We need look no further than English: "man can do something," "man can do not do something" (i.e., can do but does not have to), then the pretty straightforward "man can not do something" and, all of a sudden, to express that man cannot decline some obligation, we say "man can not help but do."
It is not translation per se, but shows that some parts of language were evolved to tiptoe around non-customary things, in this case, double negation. And double negation is very easy in some other languages.
"'Language affects thoughts' (as in German is good for Philosophy and English for getting things done). If both these thoughts are true..." Well, the second one isn't true (I omitted the first one in this reply). It is simply not the case that German is good for philosophy and English for getting things done, and similarly for most other such claims (French is better for talking about love, Italian is better for operas, etc. etc.).
My personal analogy, useful in my early days: Translating is like finding a vector in another space that points in the same direction or carries a similar magnitude of meaning.
In other words:
The source sentence is a vector in “language A space.”
The target sentence is a vector in “language B space.”
A good translation finds a vector that has the same direction (same meaning, intent, tone) even though it lies in a different coordinate system (the new language).
I know what you mean, but semantics is about relative positions of points in a given space. Comparing two points from two different spaces is apples and oranges. I feel like this analogy should be salvageable with a small tweak, however.
> a gentle, poignant sadness or pathos felt in response to the transient nature of all things, a deep awareness of their impermanence that evokes a subtle, bittersweet sorrow and a profound, quiet empathy for their passing.
> I hope these rather unorthodox leaps between linguistics and mathematics helped make it almost obvious that some words and ideas are untranslatable in practice. I also hope you don't take the analogy too seriously, because it won't go much further than this.
The word 'word' is polysemous, and also vague. Polysemous: it can refer to written words or spoken words (or signed words, in sign languages).
Vague: does it mean all the inflected forms of a word, or just the stem without inflection? Example: is (are?) 'walk', 'walked', 'walks' and 'walking' four words, or one? What about "stand/stood"? (And languages where the bare root or stem can't appear by itself, like verbs in Spanish.) Derived words, like 'push' and 'pushy'.
Do compound nouns count as a word, or do only the parts count? Example: 'doghouse' (or 'dog house'). What about idioms? Example: 'to crane his/her/my/your/our neck(s)'.
What about different pronunciations? Is 'roof' pronounced to rhyme with 'aloof' the same word as 'roof' pronounced with the vowel of 'put'? And different spellings but same pronunciation: 'bear' vs. 'bare'.
What about words with different grammatical categories, like 'push' as a noun ("I gave her a push") or a verb ("I pushed her"). Or the same word with virtually unrelated meanings, "I pushed her on the swing" vs. "I pushed my ideas."
上京 jōkyō is less a regular word and more a form of written shorthand. It would not be used in speech, and even in writing it's ambiguous: not only does 京 kyō simply mean "capital", but there's a district of Kyoto that's also 上京, only read Kamigyō.
The closest English equivalent is abbreviations like "PC". They're perfectly usable in context, but if you see one standing alone it's not clear if it's personal computer, politically correct, Peace Corps, etc etc.
It's a tenuous analogy, but if you go along with it, you can take it further.
You could consider the "cost" of expressing a word as some kind of metric or norm on the vector. What in one language/basis is a simple Kronecker delta, in another is a very complex vector (of course if it were the same vector in two bases, it would have the same length, but we could rather think of translation as an affine transformation, say).
And finally, with two bases, they need not span the same vector space. You can have a three-coordinate vector space all you like, if you have only two basis vectors you ain't spanning it. At best you can hope for an orthogonal projection from one to the other, and lose some nuance.
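A minimal sketch of that last point, taking the analogy at face value: the "thought" vector and the two-vector basis below are invented numbers, and the basis is assumed orthonormal so that the projection is just a sum of dot products.

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def project_onto_span(v, orthonormal_basis):
    """Orthogonally project v onto the span of an orthonormal basis."""
    result = [0.0] * len(v)
    for b in orthonormal_basis:
        coeff = dot(v, b)
        result = [r + coeff * bi for r, bi in zip(result, b)]
    return result

# Hypothetical "language B": only two basis vectors in a 3D space.
basis_b = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
thought = (2.0, 1.0, 3.0)  # made-up concept with a component B cannot express

translated = project_onto_span(thought, basis_b)
lost_nuance = [t - p for t, p in zip(thought, translated)]
print(translated, lost_nuance)
```

The projection is the closest point language B can reach; everything along the third axis ends up in `lost_nuance`.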
Eventually, with bilinguality, you learn not to translate words. Concepts live in different languages and describe a reality. Usually you can describe that reality in two different languages, but sometimes not.
Big claim but not much substance. They should try to really understand linear algebra first, and also linguistics a bit. Semantic domain (from linguistics) is a better way to describe it, where using sets (from math) might better convey what they want to say.
There's one aspect that I think the article starts to hint at but doesn't quite make the jump to: words in a language just map to a subset of concepts, and those subsets don't necessarily have the same boundaries in other languages.
If you think back to the meme from a decade or two ago about how men and women perceive colour [1], where e.g. "pink" to a man covers a whole range of colours to a woman, then that kind of hints at the idea.
One example back in the realm of vocabulary is the English word "happy". This covers a range of meanings: joy, willingness, being pleased, contentment, satiation, etc. There might be some overlap in some of these meanings with other words like "joy" or "excited", which don't have the same overlaps in other languages. E.g. "happy" might be translated to French as "heureuse" for the senses of pleased or content, but not for the willingness sense.
Similarly, the French word "dommage" can be translated into a whole bunch of English words that aren't normally synonyms of each other - pity, damage, shame, harm.
This kind of nuance can lead to two opposite problems when translating - when the meaning is limited to a subset of possible meanings by context, and the wrong one is chosen in the foreign language, and when the author's meaning embodied multiple meanings and the chosen translation doesn't cover all of them.
Some of these features can lead to the humour in subtle jokes being lost in translation, e.g. "he'd be late to his own funeral".
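The subset idea can be sketched with plain sets. The sense labels are toy inventories taken from the happy/heureuse example above and cut down to three senses; they are assumptions for illustration, not dictionary entries.

```python
# Toy sense inventories (assumed, heavily simplified).
senses = {
    "happy": {"pleased", "content", "willing"},
    "heureuse": {"pleased", "content"},
}

def covers(intended, word):
    """True if every sense the author intended is within the word's range."""
    return intended <= senses[word]

# Problem 1: context narrows "happy" to one sense, and the target word
# does not include it ("I'm happy to help").
narrowed_ok = covers({"willing"}, "heureuse")

# Problem 2: the author means several senses at once, and the target
# word only carries some of them.
combined_ok = covers({"pleased", "willing"}, "heureuse")

# What the single-word translation silently drops.
dropped = senses["happy"] - senses["heureuse"]
print(narrowed_ok, combined_ok, dropped)
```

Both failure modes in the comment come out as the same subset check, just with a different set of intended senses.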
Communication/language depends on shared context. The more context you share the shorter the trigger for evoking that thing and that context. And if you share no context communication becomes very difficult.
And honestly, without a lot more communication, even with a person who speaks your language, you have no idea whether you actually have a shared context. An American from NYC and one from some backwater town in Kansas share a lot of context, but there is also a lot they don't share, so as communication between them becomes more detailed it's very likely that the 'translation' between them will be somewhat incorrect.
This is also why lawyer speak is so particular. Language is fuzzy in most cases. Only language that relates to discrete physical objects gets closer to the binary state of exactness described in the article.
Tangent: I really like Voronoi diagrams and part of me thinks there's a hidden, precious concept they represent. I didn't get their relation to the article but was wondering if they have applications in engineering/sciences.
A Voronoi diagram is created when you color every point on an image according to which discrete point it is closest to.
So in this case, I see the diagrams as representing the boundaries drawn when projecting / quantizing complex ideas into a set of central points that are insufficient for catching all of the nuance of the original. How well can you adapt a nuanced idea to a different space?
If Language A has an idea that exists at one point in space, which is the closest word in Language B that might be used to represent it? A Voronoi diagram is one possible way of illustrating it.
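As a rough sketch of that reading: looking up which Voronoi cell a point falls in is just a nearest-neighbour search over the seed points. The 2D coordinates and the names `word_1`..`word_3` below are made up; real meanings obviously don't come with coordinates.

```python
import math

# Hypothetical 2D "meaning" coordinates for three words in Language B.
language_b = {
    "word_1": (0.0, 0.0),
    "word_2": (4.0, 0.0),
    "word_3": (0.0, 4.0),
}

def nearest_word(point, lexicon):
    """Return the word whose seed point is closest to `point`,
    i.e. the Voronoi cell that `point` falls into."""
    return min(lexicon, key=lambda w: math.dist(point, lexicon[w]))

idea_from_a = (3.0, 1.0)  # an idea from Language A, made-up position
closest = nearest_word(idea_from_a, language_b)
residual = math.dist(idea_from_a, language_b[closest])
print(closest, residual)
```

The `residual` distance is one way to picture the nuance lost by snapping the idea to the nearest available word.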
Tangent on your tangent: this GDC presentation from 2016 is probably my favorite real-world application of Voronoi Diagrams, and uses them for N-player split-screen camera control: https://www.youtube.com/watch?v=tu-Qe66AvtY&t=1594s
I have a lingering dream in the back of my mind to make a single-couch Liero-style casual game for N-players with good dynamic camera support using this technique.
What a trendy article, in tune with our recent linear-algebraic turn in how we see language, thanks to LLMs.
But I think this exposes an even greater problem, where words thought to be direct translations will always drift in vector value as they are weighted for attention within their respective corpora. Are we on the brink of translation-nihilism?
This isn't even limited to complex phenomena or shades of snow. Even "I like" is a different construction in many languages, in an unexpected way to new language learners.
Just read Wittgenstein (The Blue and Brown Books / Philosophical Investigations), and this confusion will go away. The difference between translation, definition, and explanation needs to be understood.
This article assumes that concepts are somehow precise coordinates within a single language; that's not the case, at best, speakers of a language mutually approximate a relatively consistent representation, but like, look at a word like yeet or whatever: we decided as a society on its meaning while it was being developed, as it were. Furthermore, it never rigorously defines what it means by translation. It claims 上京 is a single basis meaning moving to Tokyo, for example, but that isn't even an accurate translation: the individual components represent superior/greater/above and Tokyo and as an idiomatic phrase it represents the concept of moving to the capital for a better life. Something like "moving on up" or the like in some vernaculars of English, and idioms translating to idioms is a form of translation. It's disingenuous to represent the first concept as a single basis but not the second.
Similarly, it claims mono no aware (物の哀れ) is unable to be translated, but, again, more literally "translated" is saying "the sorrow within things" character by character, and, only as an idiom has the full contextual understanding. It's not really a single point even if it's rather accurately located in a hypothetical embedding space by Japanese speakers. Imo, an English translation of the concept is "everything is dust in the wind", only 2 more individual conceptual units than the original Japanese phrase, and 3 of them are mainly just connecting words, but it's understood as a similar idiom/concept, here.
Concepts are only usefully distinguished by context and use.
By the author's own argumentation: nothing is translatable (or, generally, even communicable) unless it has a fixed relative configuration to all other concepts that is precisely equivalent. In practice, we handle the fuzziness as part of communication, and it's useless to try to define a concept as untranslatable unless you're also of the camp that nothing is ever communicated (in which case, this response to the author's post is completely useless, as nobody could possibly understand it enough internally for it to be useful. If you've read this far, congrats on squaring the circle somehow)
Well said! To add on: if meaning is largely not "in" the words themselves, but embedded in a shared cognitive space, then in order to have a truly singular (ie "untranslatable") basis point would require positing unique cognitive mechanisms or some experiential quality that is unknown to members of the target language. But as you pointed out, most concepts do have an analogous representation in most languages, even if the tokens in use appear superficially different. And this is merely because the context in this case is a shared cognitive substrate (the low-level operating system, if you will) consisting of sense data, emotions, and so on, which in its fundamental operations does not substantially differ between members of the human race - or so I would argue. In either case, what matters seems to me to be not so much the actual tokens but the experiences or cognitive context in which they are embedded.
This. Two speakers of the same language only have approximately the same understanding of the meanings of the words they both use. Communication succeeds because we are constantly seeking and correcting misunderstandings that arise due to no two people speaking exactly the same language.
The same process that allows two speakers of the same language to communicate adequately allows one to translate from one language to another. If it were truly impossible to translate from one language to another, we would be unable to perceive this and argue about it. The recognition and correction of errors is part of the process of translation just as it is part of the process of communication in a single language.
I like this as an analogy but not as an explanation. In fact, if you’re unfamiliar with linear algebra, this might be a nice way to think about projection onto a different set of basis vectors. But even the best human translators can be deeply at odds over what translation is appropriate for a term in its context. There’s never a right translation, let alone a uniquely right translation.
Reading any poem that makes use of extensive wordplay within a language shows why there will always be some untranslatable aspect. You can't create all the exact shades of a single pun if all those shades aren't in a different language.
Go translate an ee cummings poem and make sure to retain all its meanings.
The article seems to think that a word is untranslatable if there is no single word in the target language. If I'm not misreading the article, then this is completely obvious -- just consider the number of words in English and the number of words in almost any other language, and you will find that there are more English words than the other language. It is now clear that there exist English words that don't correspond to a single word in the other language.
> It is now clear that there exist English words that don't correspond to a single word in the other language.
But that's true of any language. Not only that, but English uses loanwords heavily which are often Anglicisations of words from other languages, which may not in themselves be just one word.
"Ho ho ho", the flag-waving Little Englander types say, "Gaelic is such a stupid language, they don't even have a word for 'television', they just say 'television' in a stupid accent!"
But English also has no word for "television". Worse, the word "television" isn't even just a loanword, it's two words from two different languages, "tele" from Greek and "vision" from Latin. What a bodge job! Imagine letting something like that slip through to production use!
The hypothetical Catalan-Hungarian inventor of it in another leg of the trousers of time may have called it llunylátás, and then where would we be?
Well, most languages would have some variant of that word to mean "television", as they do now, I expect.
The English word "galore" (meaning "sufficient" shading towards "more than enough") comes from the Gaelic words "gu leòr", (goo lyaawr, the grave accent above the o makes the vowel sound longer). What a silly language English is, doesn't have a word that means "more than you're ever likely to need", has to steal one from Gaelic and then spell it wrong.
Oh, they use this word "whisky". You know what that means? It means "uisge beatha" but they only say the first word, in a silly accent because they can't pronounce it properly.
Quite often there's no single word for a thing you're trying to translate, but that doesn't mean it's untranslatable. English has only one single word for rain, for example, but Gaelic has about half a dozen, of which the only ones I can reproduce here are "uisge" (that word again), which just means "water", and "fras", which is more like a gentle shower. The rest of the words in the Gaelic of the North-West of Scotland that refer to rainy weather are, of course, profane in the extreme.
"English also has no word for 'television'." Oh, for goodness' sake. OF COURSE English has a word "television". The fact that you can trace its etymology back to Greek and Latin doesn't mean it's not an English word. If you confronted a native speaker of Latin who also spoke Greek (a common situation back then, also vice versa), they would have no clue what "television" meant any more than most people would know what a "Fernseher" is.
He would know both tele and vision. Remote viewing? Some kind of magic?
I found the word τηλαυγής (telauges), "far-shining", meaning "visible from far away". Like a lighthouse. So some theoretical ancient might hear "television" and understand it as "looking at distant landmarks".
Might think it referred to someone with good distance vision (not myopic).
Your comment is silly, but I can play along.
English people will say something like: Germans have a word for everything.
Many of which are just sentences with the spaces removed.
Australians have a lot of those too, or worse: our speech is often nothing but a handful of vowels and a swarm of apostrophes.
> Australians have a lot of those too, or worse: our speech is often nothing but a handful of vowels and a swarm of apostrophes.
VLIW natural language.
I love it!
Now here's where it gets interesting: there is no agreed-upon definition among experts what a word is. So there's no point in arguing about it if the thing we're arguing about doesn't even have a rigorous definition.
Enter: the morpheme
The English word for “television” is television.
Does no-one else use the english word "telly" for a television?
Not to mention that the English dictionary is stuffed with legacy words that no natives understand. Is a word even part of the language if no native uses it? That's another debate.
It's stuffed with unusual and rare words, and no native speaker understands all of those.
But I think most of those words are in use somewhere, for something.
Is there any overlap between these unusual and rare words and GRE vocabulary?
The GRE vocabulary is actually based on French, Latin and Greek, not English. Much less rare and unusual once you realise that.
I’ve seen a lot of weird takes on the internet, but “English has no word for television” takes the crown.
I think it's meant to demonstrate how 'gaelic has no word for television' is a dumb statement.
Do you really not know what that statement means?
That isn’t a proof. Synonyms can bolster the enumeration sans augmenting novelty.
It kind of is a proof if we assume that single words can be translated at all. Translate a single word from Language X (more words) to language Y (fewer words) and back. I can't uniquely recover all the words in Language X that way.
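A tiny sketch of that round-trip argument, in Python. The two lexicons are hypothetical toys (four made-up entries for Language X, two placeholder tokens for Language Y), and it assumes each word maps to exactly one word, which is the premise being argued about:

```python
# Hypothetical toy lexicons: Language X has four words, Language Y only two.
X_TO_Y = {
    "sofa": "Y_SEAT",
    "couch": "Y_SEAT",
    "autumn": "Y_SEASON",
    "fall": "Y_SEASON",
}

# Going back, each Y word can pick only one X word.
Y_TO_X = {
    "Y_SEAT": "sofa",
    "Y_SEASON": "autumn",
}

def round_trip(word):
    """Translate X -> Y -> X, one word at a time."""
    return Y_TO_X[X_TO_Y[word]]

def lost_words():
    """X words that do not come back as themselves after the round trip."""
    return sorted(word for word in X_TO_Y if round_trip(word) != word)

print(lost_words())  # ['couch', 'fall']
```

With more words on one side than the other, some word has to collide on the way out, so it cannot be uniquely recovered on the way back. Whether that collision matters is exactly the synonym question raised in the replies.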
Does translation have to be a bijection? I don’t think so.
I don't know about that. For many practical purposes probably not?
I'm just on the thread following this idea: "The article seems to think that a word is untranslatable if there is no single word in the target language"
So we're talking about "translatability" of single words. Mapping multiple words of language X to one word of language Y is going to have some effect on translation.
Nope, because of synonyms as well.
That is the crux of the article's premise: each synonym conveys similar denotations (principal component is, I think, what the article called it), but usually with some difference in connotations (the off-axis contributions). You can nudge the languages' vectors towards each other by adding enough synonyms and modifiers together, but they are always a little bit off even still.
So, really, this can be simplified to the question "can written text fully convey all human concepts", some of which have labels in only some languages, and the answer is an obvious "no".
Synonyms rarely have identical meanings. For example:
Happy: Joyful, cheerful, merry, delighted
Or
Beautiful: Lovely, pretty, attractive
The only truly identical synonym I can think of is flammable and inflammable
I thought there was a difference between those two in how they start burning, like one needs an external flame to start while the other can burst into flame without an obvious ignition source.
Perhaps perhaps.
But what joyful means to you likely differs from what it means to me, simply because we haven’t read the exact same literature and had the same conversations.
True true.
True, but many languages now have words that were absent from their earlier vocabularies. Shakespeare did not have the option to use 'telephone', 'semiconductor' or 'entropy'.
I think the reasonable reader will conclude it's unlikely for any two languages to share exactly the same vocabulary, accounting for synonyms.
Not sure this approach really accounts for the difference between a language like German where you have one compound word for a concept that would require multiple words in English. For one good example, the German "Nomenkompositum" is "compound noun" in English.
A giant portion of English vocabulary actually consists of compound words. English loves using compound words but only if the roots are sourced from Latin or Greek: words like electrocardiogram ("electronic heart picture", sourced from Greek), agriculture ("field nurturing", from Latin), and telecommunication ("far sharing", a hybrid of Latin and Greek roots). Probably the overwhelming majority of the words in an English dictionary will be compound words, and people regularly coin neologisms ("new words") using this formula.
An English speaker might be willing to accept componoma ("names placed together", Latin) or synthetonoma (also "names placed together", Greek) without breaking stride.
I wasn’t saying there are no compound nouns in English at all. If you count portmanteau words like “Brexit” and jargon, there is a massive abundance of them. All I was saying is the approach would count certain concepts as untranslatable when they clearly aren’t, simply because in one language you have a compound word and in the other language you use several words to express the same concept. It’s definitely not untranslatable, but the translation function isn’t one to one.
I think your point basically asks the question "what counts as a word" because clearly German has infinitely more "words" than would ever appear individually in a dictionary. I'm saying that English does, too.
A couple of ape cubs who learned sign language saw a duck and invented "waterbird". We have to know two dead languages to know if aquaplaning or hydroplaning is the right word.
Language while involved in that water related process is probably drawn from Anglo-Saxon or possibly Old Norse. No refined Mediterranean stuff.
What sticks out to me is that the first word in these ends with a vowel so they don't sound like compound words.
> English loves using compound words but only if the roots are sourced from Latin or Greek: words like electrocardiogram
This is false; English loves using compound words. One example of such a compound word is "fire department", which has identical syntax to the German compound "Feuerwehr". Whether a compound word is spelled with or without internal spaces is not a fact about the language, it's a fact about the spelling.
If you ignore the spaces, the only real difference between German and English compound nouns is the infixes between elements to show bracketing. Case in point: Nomenkompositum
It's the same structure in both languages. Just because it's written as if it were a single unbreakable word doesn't mean it is--or contrariwise, the fact that it's written as two things with a space in between doesn't mean that it's two "words" in English. The problem lies in the meaning of "word." Is 'doghouse' one word in English, while 'dog house' is two? No.
That's just a difference in orthography. English could easily have had an orthographic standard where we write "compoundnoun" for compounds. This is in contrast with a language like French, where compound nouns are relatively rare. Compare English "Olive oil" and German "Olivenöl" with French "huile d'olive". In French you need to have a preposition to combine the two nouns, whereas English and German do noun-noun composition.
You are right but neither yours nor those of the previous posters are good examples of compound nouns.
These examples have just the meanings of a noun + adjective or of a noun + noun in genitive case, where some languages are lazier than others and omit the markers of case or of adjectival derivation from noun, which are needed in more strict languages.
There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words, but only some related meaning (usually either a pars pro toto meaning or a metaphorical meaning). Those are true compound nouns, not just abbreviated sequences of words from which the grammatical markers have been omitted.
Such compound words were very frequent in Ancient Greek, from where they have been inherited in the scientific and technical language, where they have been used to create names for new things and concepts, e.g. arthropod, television, phonograph, basketball, "bullet train" and so on.
Compound words of this kind are almost never translatable, but they are frequently borrowed from one language to another. During the borrowing process the component words are sometimes translated, but the result is not a translated word; it is a new word that is added to the destination language.
> There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words, but only some related meaning (usually either a pars pro toto meaning or a metaphorical meaning)
The example that people often quote from German is “kummerspeck” which would literally translate as “grief bacon”, but means weight you put on through comfort eating having gone through a bereavement or other trauma.
> There are also other kinds of compound nouns, where the compound noun does not have the meaning of its component words
Wouldn't cranberry morphemes be good examples of this type of relationship? I don't know if, in the eponymous example, the cran- being bound precludes it from being counted as a closed compound word or not though.
> and you will find that there are more English words than the other language. It is now clear that there exist English words that don't correspond to a single word in the other language.
You're forgetting about synonyms. The common adage that English has the largest vocabulary stems from the fact that it often has multiple words for the same thing. Sofa, couch. Autumn, fall. Etc etc. Other languages generally don't do this. I've never heard anyone suggest that English has words for more concepts.
Sofa and couch are only interchangeable in some contexts. They are different “flavors” of similar ideas.
This becomes immediately apparent (and relevant) when writing fiction or poetry. At least it does to me.
Non-fiction and spoken English do not highlight the subtleties between these words because using them interchangeably in the same work is considered bad form.
There are relatively few cases of true synonyms in English (or any language). There are subtle differences in meaning, register, etc that are recognized by native speakers.
I don't know what you mean by a "true" synonym, but that is false. There are historical reasons there is a lot of word-doubling. The fact that synonyms might carry additional subtle connotations -- i.e. maybe you find "autumn" more poetic than "fall" -- doesn't change the fact that they are synonyms.
One of the main points of language is to evoke ideas in others' minds. "Autumn" and "fall" will normally evoke different ideas in US English (the former bringing about views of the cozier parts of the season, and the latter being more sterile, used to refer to a particular region of time). Maybe we disagree about what a "true synonym" is, but that distinction seems important to me.
But perhaps all languages have a countably infinite number of words, in which case that proof doesn't work. (In English we have: legless, leglessness, leglessnessless, leglessnesslessness, ... It's not a great example, but it's good enough.)
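For the curious, that sequence is easy to enumerate. A minimal sketch in Python, assuming the rule is simply "alternate the suffixes -ness and -less forever" (the generator name `legless_words` is made up):

```python
from itertools import islice

def legless_words():
    """Yield legless, leglessness, leglessnessless, ... without end,
    by alternately appending the suffixes "ness" and "less"."""
    word = "legless"
    suffixes = ("ness", "less")
    step = 0
    while True:
        yield word
        word += suffixes[step % 2]
        step += 1

# Take the first four members of the (countably infinite) sequence.
print(list(islice(legless_words(), 4)))
# ['legless', 'leglessness', 'leglessnessless', 'leglessnesslessness']
```

Each word gets a natural-number index, which is all "countably infinite" asks for. Whether anyone would accept the tenth one as a word is a separate argument.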
Even if the number of words in a language were finite we wouldn't have a reasonable way of counting them. There are too many kinds of fuzziness involved in deciding what counts as a "word" and you can't ignore the borderline cases because the borderline cases vastly outnumber the straightforward cases.
"...there are more English words than the other language" There might be more words in some English dictionaries than in some dictionaries of other languages, but that may just be due to a lot more effort having gone into English lexicography than X language lexicography. I doubt that most native speakers of English know more English words than equally educated native speakers of some other language know words of their language.
I think he’s rather arguing that no language is perfectly translatable to another. He only uses “untranslatable word” as an instance of that claim.
There's a real irony that the examples are coming from Japanese since it is an agglutinative language.
I think people don't realize how weird language is. Like you could look at Chinese and call each sentence a "word" as there are no spaces. What's the difference between that and a compound word like "nighttime" or the whole German language where you got words like Krankenwagen ("patient" + "car").
Now this doesn't mean there aren't words or phrases that aren't translatable. But the thing is we can always translate the words themselves. What we can't always translate is the meaning behind them. I think the best example of this comes from Star Trek and the Tamarian Language[0,1]. "Sokath, his eyes open!" The problem with communication is not that the words don't translate, it is that the meaning behind them doesn't. Just as people struggle with idioms when learning American English or why someone might be confused about why someone "shit in the milk" or "fucked the dog". Words are an embedding. A compression.
The thing people are constantly forgetting, but is more important than ever in a globally connected world, is that words are not perfect representations of thoughts. We compress our thoughts into them and hope the person on the other side can decompress them. It is why you can more easily communicate with your close friends, who have better context, than you can with another person that natively speaks your language, and it is why someone that learns a new language can speak perfectly well but still struggle to communicate. Language is not just words, it is culture[2]. So in a much more connected world today we have these disconnects in culture and thus in the interpretation of what people say. I know every one of you has been told to "speak to your audience", but how do you speak to your audience when your audience is everybody and when you don't know who your audience is? The new paradigm requires us to be much better interpreters than we were before. Otherwise everyone is going to sound crazy, other than those you frequently talk to and have that shared understanding with.
[0] https://memory-alpha.fandom.com/wiki/Tamarian_language
[1] https://www.youtube.com/watch?v=3-wzr74d7TI
[2] This is, btw, why people argue for embodied AI being so critical. Not because LLMs can't appear to grasp the language, but because we as humans have embodied our language so deeply you probably didn't even realize that I used the word "grasp" to refer to an abstract concept and not something you can actually touch with your hand.
You're correct.
In another blog post where he uses "shibui" as an example of an untranslatable word, he says, "Saying shibui like that, in a mere second, conveys what would otherwise make a clunky and unnecessarily long digression."
At the root of nearly all the blog posts like this one (basically explaining why they don't agree with a widely held belief) is a redefinition of a term or word into something very specific that contradicts the common definition.
Yeah, I was interpreting 'untranslatable' to mean what it says, but they meant 'untranslatable with only a couple words', which is a very different claim.
I think a succinct way to describe my thoughts on linear algebra/language is that language has high dimensionality (ie many different basis vectors that may not necessarily be orthogonal) and that individual languages use a unique coordinate system to express thought. Each language is a lossy approximation of all conceivable thought and some languages can more efficiently represent the “all thoughts” vector space because they have basis vectors that point in more uncommon directions (like the go to Japan example). So while you can more or less point to any thought in any language, some thoughts are easier to express in certain languages, which the post (and I) agree to be untranslatable words.
I tried to find the really interesting article about language and color that describes how some cultures use different naming schemes for colors but couldn’t find it. It talked about how back in the day we didn’t know orange as a color, we just thought it was red-yellow, and only after the fruit was distributed did the word for the color catch on. Here’s the best article I can find that talks about this phenomenon https://burnaway.org/magazine/blue-language-visual-perceptio...
> Each language is a lossy approximation of all conceivable thought
I'm not quite sure I understand this—I do have mental sensations/processes sans language, but I would not characterize them as "thoughts". To me, a thought is inherently linguistic, even if they relate to non-linguistic mental processes. So to me, learning a new language is very literally learning how to think differently.
I think we’re in agreement, but I’m afraid I don’t have the philosophical language to precisely pin my mental model into words (what a meta conundrum lol). I’ll try my best here, but I may come back in a few days with an edit if I can more coherently write my ideas.
I take a slightly more narrow definition of “thoughts” that may be more akin to “expressions” - ideas that can be communicated, so excluding non-linguistic mental processes. I think that may be where we disconnect. A lot of my idea about thoughts comes from the Borges story, Funes the Memorious (short story about a dude who could not forget - interesting read and really clarifies my feelings on my definition of “all possible thought”). In the story he talks about tree leaves, but instead imagine needing a unique linguistic scheme for every single unique snowflake you ever see. It would be a linguistic nightmare! Therefore language must generalize, otherwise it becomes noncommunicable, and that generalization to me induces the “lossy approximation” I attribute to language in my prior comment.
So, in my head Funes’s mind represents the abstract space of all possible thoughts. When we use language, we are stacking words/sentences/paragraphs/etc together almost like vector addition, trying to reach a particular point in the thought vector space. Some languages have really clean ways of getting to certain thoughts while others take a mouthful and still don’t get you exactly there (物の哀れ example from link).
I agree with your statement on new languages being different thinking. As you follow that vector addition process to get to the “thought,” different languages will take you on different paths to get to your destination thought because languages encode those vectors differently, even if the destination thought is the same. In my mental model, the act of thinking is putting those language vectors together and tracing their path to get to your thought.
And if my comment still makes no sense - I might have to incubate this thought a bit more :) but I do recommend the story- it’s a quick, thought provoking read.
> I take a slightly more narrow definition of “thoughts” that may be more akin to “expressions” - ideas that can be communicated, so excluding non-linguistic mental processes.
I was glad to read this because it seemed too neat and tidy for "thought" to necessarily be able to be encoded into language, especially in the presence of frequent miscommunication between people that share language, culture, and context.
On language and thinking, I agree that new languages promote thinking differently. But it seems that the difference has to fall short of the Sapir-Whorf hypothesis of informing perception or experience. Which would then limit the extent to which thought, as informed by language, would influence the way one would compose a linguistic representation of some thought/idea/"blob of meaning to be communicated." All to suggest that there is a broader landscape of "thinking-like activities" than those which would be able to be encoded linguistically.
Maybe it's simpler to say that I think of language as more lossy than thought.
I would argue that you can consider those thoughts. But this is the difficult bit: I've had the experience before of thoughts/feelings, whatever you want to call them, where words fall short. Knowing multiple languages helps a bit but it still falls short sometimes (very rarely).
Language is very effective at this, but I don't think thought is inherently linguistic.
To me language is just a way to label, group or organise these things. So when you learn a new one you learn a new 'labeling system/taxonomy' does that sound familiar?
> If the mere sight of the above is like a punch in the face for you, don't worry. I'm not going to math you to death in what follows. I will only remind you of a tiny basic part of it that I think relates to languages.
Yes, that mathematical expression is like a punch in my face, but not for the reason you think. I am offended that the rank of the matrix does not match the dimension of the matrix, not that I'm seeing a matrix.
You probably mean that the size of the matrix is incompatible with the size of the vector?
Yes, and embarrassingly there are two mistakes in the comment! I used "rank" of the matrix rather than dimension too.
It's a 3x3 matrix with 3 independent rows. The rank matches the dimension.
He probably meant to say "vector" the second time he said "matrix".
Interestingly enough for this morning's walk I was musing over the tension between the hypotheses that: 'LLMs can map between languages in the vector space' (thus languages are ~equivalent); and 'Language affects thoughts' (as in German is good for Philosophy and English for getting things done).
If both these thoughts are true, then it would appear that languages have topological characteristics. We can (topologically) map from one to another, 'thoughts' (that is a complex of words) form 'paths on the language manifold' and certain paths may be more 'natural' in one topological form than the other.
My take is the human brain learns concepts primarily through differentiation. A newly born child who has no concept of a door or a wall has no reason to see the two as being different parts. Different languages form different differentiations, but one can always compound concepts, and differentiate them differently.
To extend: there will also be general alignment tendencies towards those readily mapped and expressed concepts within available language. Hard but useful concepts can get mapped to idioms. Modes of categorization will be influenced by these factors, which in turn influences many processes.
What is a single word in one language is a collocation of words in another, possibly context-dependent.
We can look no further than English: "man can do something," "man can do not do something" (i.e., can do but does not have to), then pretty straightforward "man can not do something" and, all of a sudden, to express that man cannot decline some obligation, we say "man can not help but do."
It is not translation per se, but it shows that some parts of language evolved to tiptoe around non-customary things, in this case, double negation. And double negation is very easy in some other languages.
"'Language affects thoughts' (as in German is good for Philosophy and English for getting things done). If both these thoughts are true..." Well, the second one isn't true (I omitted the first one in this reply). It is simply not the case that German is good for philosophy and English for getting things done, and similarly for most other such claims (French is better for talking about love, Italian is better for operas, etc. etc.).
My personal analogy, useful in my early days: Translating is like finding a vector in another space that points in the same direction or carries a similar magnitude of meaning.
In other words:
The source sentence is a vector in “language A space.”
The target sentence is a vector in “language B space.”
A good translation finds a vector that has the same direction (same meaning, intent, tone) even though it lies in a different coordinate system (the new language).
I know what you mean, but semantics is about relative positions of points in a given space. Comparing two points from two different spaces is apples and oranges. I feel like this analogy should be salvageable with a small tweak, however.
When did you develop this analogy? Was it well before 2015, when Google demoed a vector model that solved Man:Woman, King:_____?
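For anyone who hasn't seen that analogy trick, here is a minimal sketch of the arithmetic. The vectors are hand-made toy values (an assumption for illustration, not output from any trained model), and `analogy` is a hypothetical helper name:

```python
import math

# Hand-made toy vectors (hypothetical, NOT from a trained model).
# Dimensions are loosely (royalty, maleness, femaleness).
vectors = {
    "man":   (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "king":  (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "apple": (0.1, 0.1, 0.1),
}

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = tuple(
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "woman", "king"))  # queen
```

Note that everything here happens inside one space; nothing is compared across two different coordinate systems.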
Yeah, I was hoping the article would say something about word vectors and linear algebra.
Are they multiplying a 3x3 matrix by a 2-component vector?
Yeah that made me twitch also.
In that one case, yeah; I don't think they're going for anything more than general illustration here.
The text that follows does take on a new meaning though, for those that know linear algebra:
If the mere sight of the above is like a punch in the face for you, don't worry.
Almost makes me wonder if it was intentional.
Bears. Beets. Battlestar Galactica!
But what does that illustrate?
Like, what a matrix looks like? Seems fine.
It also has mixed square brackets and curved parentheses. I stopped reading the article when I saw this.
Brackets can be any shape, it's fine. Like (3)*[4] is still 12. But that matrix-vector product is undefined.
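To spell out why that product is undefined, here is a small sketch in plain Python. `matvec` is a hypothetical helper written for this comment, not something from the article: each row of the matrix must have exactly as many entries as the vector has components.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector; raise if shapes disagree."""
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError(
                f"cannot multiply: row has {len(row)} columns, "
                f"vector has {len(vector)} components"
            )
    # Each output component is the dot product of one row with the vector.
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matvec(identity3, [2, 3, 4]))  # 3x3 times a 3-vector: defined

try:
    matvec(identity3, [2, 3])        # 3x3 times a 2-vector: undefined
except ValueError as err:
    print("undefined:", err)
```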
Everyone knows you need a 4x4 matrix to do translation, anyway. Now, scale, rotation, and skew...
> a gentle, poignant sadness or pathos felt in response to the transient nature of all things, a deep awareness of their impermanence that evokes a subtle, bittersweet sorrow and a profound, quiet empathy for their passing.
https://en.wikipedia.org/wiki/Saudade ?
> I hope these rather unorthodox leaps between linguistics and mathematics helped make it almost obvious that some words and ideas are untranslatable in practice. I also hope you don't take the analogy too seriously, because it won't go much further than this.
Phew! Thanks for clarifying.
There is a better thesis from the late philosopher W.V.O. Quine: the indeterminacy of translation [1]
[1] https://plato.stanford.edu/entries/quine/#IndeTran
Ctrl+F: Quine
That indeterminacy of translation isn’t mentioned is a huge shortcoming of this article.
The word 'word' is polysemous, and also vague. Polysemous: it can refer to written words or spoken words (or signed words, in sign languages).
Vague: does it mean all the inflected forms of a word, or just the stem without inflection? Example: is (are?) 'walk', 'walked', 'walks' and 'walking' four words, or one? What about "stand/stood"? (And languages where the bare root or stem can't appear by itself, like verbs in Spanish.) Derived words, like 'push' and 'pushy'.
Do compound nouns count as a word, or do only the parts count? Example: 'doghouse' (or 'dog house'). What about idioms? Example: 'to crane his/her/my/your/our neck(s)'.
What about different pronunciations? Is 'roof' pronounced to rhyme with 'aloof' the same word as 'roof' pronounced with the vowel of 'put'? And different spellings but same pronunciation: 'bear' vs. 'bare'.
What about words with different grammatical categories, like 'push' as a noun ("I gave her a push") or a verb ("I pushed her"). Or the same word with virtually unrelated meanings, "I pushed her on the swing" vs. "I pushed my ideas."
上京 jōkyō is less a regular word and more a form of written shorthand. It would not be used in speech, and even in writing it's ambiguous: not only does 京 kyō simply mean "capital", but there's a district of Kyoto that's also 上京, only read Kamigyō.
The closest English equivalent is abbreviations like "PC". They're perfectly usable in context, but if you see one standing alone it's not clear if it's personal computer, politically correct, Peace Corps, etc etc.
It's a tenuous analogy, but if you go along with it, you can take it further.
You could consider the "cost" of expressing a word as some kind of metric or norm on the vector. What in one language/basis is a simple Kronecker delta, in another is a very complex vector (of course if it were the same vector in two bases, it would have the same length, but we could rather think of translation as an affine transformation, say).
And finally, with two bases, they need not span the same vector space. You can have a three-coordinate vector space all you like, if you have only two basis vectors you ain't spanning it. At best you can hope for an orthogonal projection from one to the other, and lose some nuance.
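A minimal sketch of that last point, with made-up numbers: a "concept" in a three-dimensional space is projected onto the plane spanned by only two orthonormal basis vectors, and whatever falls outside the plane is the lost nuance. The names `concept` and `language_b` are hypothetical, and the helper assumes the basis really is orthonormal.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(v, orthonormal_basis):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    result = [0.0] * len(v)
    for b in orthonormal_basis:
        coeff = dot(v, b)  # coordinate of v along this basis vector
        result = [r + coeff * bi for r, bi in zip(result, b)]
    return result

# A made-up concept living in 3D, and a language with only two basis vectors.
concept = [1.0, 2.0, 3.0]
language_b = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]

translated = project(concept, language_b)
lost = [c - t for c, t in zip(concept, translated)]

print("translated:", translated)
print("size of lost nuance:", math.hypot(*lost))
```

The residual `lost` is orthogonal to both basis vectors, which is exactly the part that no combination of the target language's "words" can express.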
Eventually, with bilinguality, you learn not to translate words. Concepts live in different languages and describe a reality. Usually you can describe that reality in two different languages, but sometimes not.
Big claim but not much substance. They should try to really understand linear algebra first, and also linguistics a bit. Semantic domain (from linguistics) is a better way to describe it, where using sets (from math) might better convey what they want to say.
There's one aspect that I think the article starts to hint at but doesn't quite make the jump to: words in a language just map to a subset of concepts, and those subsets don't necessarily have the same boundaries in other languages.
If you think back to the meme from a decade or two ago about how men and women perceive colour [1], where e.g. "pink" to a man covers a whole range of colours to a woman, then that kind of hints at the idea.
One example back in the realm of vocabulary is the English word "happy". This embodies a range of meanings: joy, willingness, being pleased, contentment, satiation, etc. Some of these meanings overlap with other words like "joy" or "excited", and those overlaps don't necessarily exist in other languages. E.g. "happy" might be translated to French as "heureuse" for the senses of pleased or content, but not for the willingness sense.
Similarly, the French word "dommage" can be translated into a whole bunch of English words that aren't normally synonyms of each other - pity, damage, shame, harm.
This kind of nuance can lead to two opposite problems when translating - when the meaning is limited to a subset of possible meanings by context, and the wrong one is chosen in the foreign language, and when the author's meaning embodied multiple meanings and the chosen translation doesn't cover all of them.
Some of these features can lead to the humour in subtle jokes being lost in translation, e.g. "he'd be late to his own funeral".
[1] e.g. https://www.psychologytoday.com/us/blog/brain-babble/201504/... or https://digitalsynopsis.com/design/male-vs-female-color-perc...
Communication/language depends on shared context. The more context you share the shorter the trigger for evoking that thing and that context. And if you share no context communication becomes very difficult.
I wasn't aware that that idea was in dispute.
And honestly, without a lot more communication, even with a person who speaks your language, you have no idea if you actually have a shared context. An American from NYC and one from some backwater town in Kansas share a lot of context, but there is also a lot they don't share, so as communication between them becomes more detailed it's very likely that 'translation' between them will be somewhat incorrect.
This is also why lawyer speak is so particular. Language is fuzzy in most cases. Only language that relates to discrete physical objects gets closer to the binary state of exactness described in the article.
Tangent: I really like Voronoi diagrams and part of me thinks there's a hidden, precious concept they represent. I didn't get their relation to the article but was wondering if they have applications in engineering/sciences.
A Voronoi diagram is created when you color every point on an image according to which discrete point it is closest to.
So in this case, I see the diagrams as representing the boundaries drawn when projecting / quantizing complex ideas into a set of central points that are insufficient for catching all of the nuance of the original. How well can you adapt a nuanced idea to a different space?
If Language A has an idea that exists at one point in space, which is the closest word in Language B that might be used to represent it? A Voronoi diagram is one possible way of illustrating it.
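Here is a rough sketch of that nearest-word idea. The words and their 2D coordinates are invented for illustration (an assumption, not real embedding data): each idea is assigned to whichever word sits closest, and coloring every grid cell that way gives a crude text-mode Voronoi diagram.

```python
import math

# Hypothetical "words" of language B placed at made-up 2D coordinates.
words_b = {
    "sad": (0.0, 0.0),
    "nostalgic": (4.0, 0.0),
    "wistful": (2.0, 3.0),
}

def nearest_word(point, words):
    """Return the word whose coordinates are closest to the given point."""
    return min(words, key=lambda w: math.dist(point, words[w]))

def voronoi_rows(words, width, height):
    """Label each integer grid cell with the first letter of its nearest word."""
    rows = []
    for y in range(height):
        rows.append("".join(nearest_word((x, y), words)[0] for x in range(width)))
    return rows

idea = (2.2, 1.9)  # an idea from language A with no exact counterpart
print(nearest_word(idea, words_b))  # wistful

for row in voronoi_rows(words_b, 5, 4):
    print(row)
```

Every idea inside a cell collapses onto the same word, which is one way to picture the nuance that gets quantized away.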
Tangent on your tangent: this GDC presentation from 2016 is probably my favorite real-world application of Voronoi Diagrams, and uses them for N-player split-screen camera control: https://www.youtube.com/watch?v=tu-Qe66AvtY&t=1594s
I have a lingering dream in the back of my mind to make a single-couch Liero-style casual game for N-players with good dynamic camera support using this technique.
What a trendy article, in tune with our recent linear-algebraic turn in how we see language thanks to LLMs.
But I think this exposes an even greater problem, where words thought to be direct translations will always drift in vector value as they are weighted for attention within their respective corpora. Are we on the brink of translation-nihilism?
This isn't even limited to complex phenomena or shades of snow. Even "I like" is a different construction in many languages, in an unexpected way to new language learners.
Just read Wittgenstein (The Blue and Brown Books / Philosophical Investigations), and this confusion will go away. The difference between translation, definition, and explanation needs to be understood.
I'm thinking this is a joke, right? Wittgenstein always seemed to me to be a good way to get confused.
This article assumes that concepts are somehow precise coordinates within a single language; that's not the case, at best, speakers of a language mutually approximate a relatively consistent representation, but like, look at a word like yeet or whatever: we decided as a society on its meaning while it was being developed, as it were. Furthermore, it never rigorously defines what it means by translation. It claims 上京 is a single basis meaning moving to Tokyo, for example, but that isn't even an accurate translation: the individual components represent superior/greater/above and Tokyo and as an idiomatic phrase it represents the concept of moving to the capital for a better life. Something like "moving on up" or the like in some vernaculars of English, and idioms translating to idioms is a form of translation. It's disingenuous to represent the first concept as a single basis but not the second. Similarly, it claims mono no aware (物の哀れ) is unable to be translated, but, again, more literally "translated" is saying "the sorrow within things" character by character, and, only as an idiom has the full contextual understanding. It's not really a single point even if it's rather accurately located in a hypothetical embedding space by Japanese speakers. Imo, an English translation of the concept is "everything is dust in the wind", only 2 more individual conceptual units than the original Japanese phrase, and 3 of them are mainly just connecting words, but it's understood as a similar idiom/concept, here.
Concepts are only usefully distinguished by context and use.
By the author's own argumentation: nothing is translatable (or, generally, even communicable) unless it has a fixed relative configuration to all other concepts that is precisely equivalent. In practice, we handle the fuzziness as part of communication, and it's useless to try to define a concept as untranslatable unless you're also of the camp that nothing is ever communicated (in which case, this response to the author's post is completely useless, as nobody could possibly understand it well enough internally for it to be useful. If you've read this far, congrats on squaring the circle somehow).
Well said! To add on: if meaning is largely not "in" the words themselves, but embedded in a shared cognitive space, then in order to have a truly singular (ie "untranslatable") basis point would require positing unique cognitive mechanisms or some experiential quality that is unknown to members of the target language. But as you pointed out, most concepts do have an analogous representation in most languages, even if the tokens in use appear superficially different. And this is merely because the context in this case is a shared cognitive substrate (the low-level operating system, if you will) consisting of sense data, emotions, and so on, which in its fundamental operations does not substantially differ between members of the human race - or so I would argue. In either case, what matters seems to me to be not so much the actual tokens but the experiences or cognitive context in which they are embedded.
This. Two speakers of the same language only have approximately the same understanding of the meanings of the words they both use. Communication succeeds because we are constantly seeking and correcting misunderstandings that arise due to no two people speaking exactly the same language.
The same process that allows two speakers of the same language to communicate adequately allows one to translate from one language to another. If it were truly impossible to translate from one language to another, we would be unable to perceive this and argue about it. The recognition and correction of errors is part of the process of translation just as it is part of the process of communication in a single language.
I like this as an analogy but not as an explanation. In fact, if you’re unfamiliar with linear algebra, this might be a nice way to think about projection onto a different set of basis vectors. But even the best human translators can be deeply at odds over what translation is appropriate for a term in its context. There’s never a right translation, let alone a uniquely right translation.
From the title, I thought he was going to explain "eigenvalue".
It could be the case that it's not even "effectively", "in practice", etc.
N^{any constant} is not bijective with a single R.
Reading any poem that makes use of extensive wordplay within a language shows why there will always be some untranslatable aspect. You can't create all the exact shades of a single pun if all those shades aren't in a different language.
Go translate an ee cummings poem and make sure to retain all its meanings.
Douglas Hofstadter's book Le Ton beau de Marot.
Another physicist who thinks he can solve problems in a domain he knows nothing about with linear approximations.
There's an xkcd devoted to this problem, even using computational linguistics as an example, IIRC.
I cannot take seriously an article that presents a 3x3 matrix being multiplied by a 2-vector as an example of linear algebra. Gibberish.