Talk:Alt code
What about making a separate page with the list on it?
We could make a page called "Windows Alt keycodes list" that would contain the list, while this page would just hold information about it. Humphreys7 (talk) 14:22, 6 March 2008 (UTC)
- Since no one has said anything on the topic, I am going to go ahead and move the table to: "Windows Alt keycodes list". Humphreys7 (talk) 14:15, 12 March 2008 (UTC)
Is this list necessary?
The Alt keycode numbers are just ASCII codes for the given symbol. Is this list necessary? Jaxal1 17:56, 15 February 2006 (UTC)
- Hi Jaxal1, it basically is not ASCII. I just made a complete rewrite of the article. Still, the list is no longer necessary, as the correct lists can already be found in the corresponding code page articles.
- I hope the information given is correct for all versions of Windows, regardless of the language version used (I use a German XP). --Abdull 17:17, 19 February 2006 (UTC)
- This is excellent. Thank you for the change. Jaxal1 01:39, 20 February 2006 (UTC)
How to Type the letter, for example ü in Waldseemüller ?
For characters whose decimal code is less than 256, the process below is valid.
If you want to use "ü" (u with diaeresis) instead of "u", for example, like in "Waldseemüller", then use these keyboard strokes / keys :
Press "W", "a", "l", "d", "s", "e", "e", "m". Then press ...
Alt + 0252 (it means, first press the "Alt" (Alternative/Alternate) key in your keyboard, and keep on pressing it (or keep on holding it) with your left hand, then press the digits 0 2 5 2 in sequence, one by one, in the right-side numeric keypad of your keyboard, then release the Alt key).
Then press "l", "l", "e", "r".
Then you will get Waldseemüller.
To make it linkable (to go to the article), put two square brackets at the beginning and at the end of the name, like this: [[Waldseemüller map]]. You will then get the (hyper)linkable Waldseemüller map.
If you want to link to that (English) article through a URL, then use the (hex) code below ...
For example, hex code "FC" stands for "ü" (its decimal equivalent is 252, and its HTML decimal numeric equivalent is &#252;). Put a "%" symbol before the hex code to express the "ü" character in a URL.
http://en.wiki.x.io/wiki/Waldseem%FCller_map
or, http://en.wiki.x.io/wiki/Waldseem%C3%BCller_map
~ Tarikash.
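The two URL forms above differ only in the byte encoding applied before percent-escaping. As a minimal sketch (using Python's standard `urllib.parse`; the variable names are purely illustrative), the first URL escapes the single Latin-1 byte FC and the second escapes the two UTF-8 bytes C3 BC:

```python
from urllib.parse import quote, unquote

# Percent-encode "ü" (U+00FC, decimal 252) under two byte encodings.
latin1_form = quote("ü", encoding="latin-1")  # one byte: 0xFC
utf8_form = quote("ü", encoding="utf-8")      # two bytes: 0xC3 0xBC

print(latin1_form)
print(utf8_form)

# Decoding each path with the matching encoding gives the same title.
title_from_latin1 = unquote("Waldseem%FCller_map", encoding="latin-1")
title_from_utf8 = unquote("Waldseem%C3%BCller_map", encoding="utf-8")
print(title_from_latin1 == title_from_utf8)
```

Which of the two forms a given server accepts depends on that server; the sketch only shows where the bytes come from.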
I think special characters can be typed on some systems with Windows Key + Period Qwerfjkl talk 21:21, 12 March 2021 (UTC)
How to type for example, Œ (Latin Ligature OE) ?
For characters whose decimal code is above 255, the Alt + Decimal_Equivalent_Number keycodes will not work for most characters. The exceptions are a few characters that are remapped below 256, for example € (Euro Sign, Alt+0128, Html-Dec: &#8364;, Hex: 20AC) and Ÿ (Latin Capital Y with Diaeresis, Alt+0159, Html-Dec: &#376;, Hex: 178).
The decimal equivalent number of Œ is 338. Its HTML decimal equivalent is &#338;. Its HTML hexadecimal equivalent is &#x152;. Its hex equivalent number is 152.
To obtain Œ, open or start the Microsoft Wordpad or Word in your computer.
Press "1", "5", "2". Then press ...
Alt + x (it means, first press the "Alt" (Alternative/Alternate) key in your keyboard, and keep on pressing it (or keep on holding it) with your left hand, then press the letter x, just one time, then release the Alt key).
Then you will get Œ. Now you can copy and paste this character where you want to use it.
If you press Alt+x again, then Œ will turn back into its equivalent hex code 152. This way you can reveal the hex code of other special characters as well. If you find a special character on a website that you want to use, either copy and paste it directly, or paste it into Wordpad and use Alt+X to reveal its equivalent hex code. Use a chart to find its equivalent decimal code, or use the HTML hexadecimal numeric equivalent to display that character.
A few other examples: for Ω (Greek capital omega, also used as the ohm sign), type 3A9 Alt+x. For ∙ (Bullet Operator), type 2219 Alt+x. For ∞ (Infinity), type 221E Alt+x. For ≠ (Not Equal To), type 2260 Alt+x.
~ Tarikash.
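The hex/decimal/HTML relationships described above can be checked with a short sketch. This is plain Python standing in for the Alt+x toggle, not how Word or Wordpad implement it, and the helper names `hex_to_char` and `char_to_hex` are made up for illustration:

```python
import html

def hex_to_char(hex_digits: str) -> str:
    """Turn typed hex digits (e.g. "152") into the character at that code point."""
    return chr(int(hex_digits, 16))

def char_to_hex(ch: str) -> str:
    """Reverse direction: reveal the hex code point of a single character."""
    return format(ord(ch), "X")

oe = hex_to_char("152")
print(oe, ord(oe), char_to_hex(oe))

# The HTML numeric references name the same code point in decimal and in hex.
print(html.unescape("&#338;"), html.unescape("&#x152;"))

# The other examples from the comment above.
for code in ("3A9", "2219", "221E", "2260"):
    print(code, hex_to_char(code))
```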
more codes?
How many combinations are there? And how do we find them all? Hehe. I was just messing around one day and found some neat ones: ALT+789 = §, ALT+456 = ╚, ALT+158 = ₧, ALT+154 = µ, ALT+2547 = ≤ --72.146.66.200 14:08, 3 August 2006 (UTC)
- Actually, the possibilities are nearly endless. Here's an example: 3跠?}?ç(?Õ?}髿疦瞯粇S6դ. You can do nearly anything. AstroHurricane001(Talk+Contribs+Ubx) 17:42, 7 January 2007 (UTC)
╔═╦═╗■┌─┬─┐ ╠═╬═╣■├─┼─┤ ╚═╩═╝■└─┴─┘
Examples of box making. Razorclaw ¦ 20070419215452
The page is broken
Look for 227. That's where it's broken. -IP User 22:02, 22 May 2007 (UTC)
- I fixed it yesterday. The problem was "rowspan." --에멜무지로 01:35, 5 July 2007 (UTC)
Redundant tables
I think the character tables in this article are completely redundant. Character tables for the obsolete code-pages can already be found in their respective articles, Code page 437, Code page 850, and Windows-1252. A character table for the complete Unicode character set would obviously be too large. If nobody objects, I will remove these tables and replace them with links to the aforementioned articles, as well as some ideas on where to get additional character charts from (e.g. Windows character map, or unicode.org). -- Timwi (talk) 14:37, 1 February 2008 (UTC)
Non-Windows Alt codes
Are there Alt codes for operating systems other than Windows? Otherwise, a merge of this article with Windows Alt keycodes may be considered. --Abdull 17:23, 19 February 2006 (UTC)
Yes, there are, but it may still make sense to merge the pages, since the differences are small. DOS supported Alt codes; so does Linux (in console mode). --217.147.80.29 12:10, 23 February 2006 (UTC)
- Agree. There's nothing here that isn't provided by Windows Alt keycodes or Unicode#Input_methods. EdC 18:21, 5 March 2006 (UTC)
- Hm. I've improved it somewhat; maybe the best solution would be an article on Codepoint-based Unicode input systems which this could redirect to. EdC 19:05, 5 March 2006 (UTC)
The text currently says "The Alt key method does not work on Linux systems." However, on my Raspberry Pi running Raspbian, which is derived from Debian, Python programs using the curses library are picking up something like this. If I hold Alt and press digits on the keypad and then release Alt, I get several bytes at once, which may represent a Unicode character. If I hold Alt+Shift and do the same, I get a different result (but also several bytes). The slash, asterisk, plus, minus, and other keys also seem to change in this case (if they are the only keypad key I press). I came here looking for information on what I was getting with Alt, and Alt+Shift on the keypad. Could this be the result of my Dell keyboard alone? 24.57.210.141 (talk) 11:37, 21 September 2013 (UTC)
The article claims that only the second method of entering Unicode works in inkscape and openoffice. In fact, on ubuntu 14.04.4 the third method works in both inkscape and libreoffice (and most other programs tried), while the first method fails and the second method fails and is ergonomically a nightmare. Holding down the shift key or pressing it during input will end Unicode input. You may, however, continue to hold down the control key after you release the shift key; when you release it after typing the digits, libreoffice will treat that like pressing enter, but in inkscape it will result in an inkblot test if you fail to press the u button. The [control-shift-u] [0] [3] [a] [9] [enter] method seems to work in chrome, firefox, midori, gedit, bluefish, kate, scite, vim (in gnome terminal), inkscape, libreoffice, gnome terminal (locale en_us.UTF-8), rxvt-unicode, xterm, amarok, keypassxc, k3b, etc., but not in emacs, which only works with its proprietary method ctrl-x 8 [return]. 73.152.113.249 (talk) 09:17, 12 March 2019 (UTC)
Quick Key bias
Is it really NPOV to have a comment saying that "Quick Key ... excels at ....", written by the author of that program? Perhaps a user with experience of the program might be able to provide a less biased view :-) 192.118.76.35 08:16, 22 March 2006 (UTC)
- I would be very grateful if you would provide us with a review of the software. An unbiased review would be very welcome. --The preceding unsigned comment was added by 216.135.95.189 (talk • contribs). 8 April 2006
Article title
I agree that Codepoint-based Unicode input systems would be the most accurate title, and from an encyclopedia point of view, this article should be merged to reduce repetition. However, in my opinion, the sort of people with the computer skills and vocabulary to get through the first paragraph of "Unicode Input Methods" already understand alt codes. Most computer users (sadly enough) have a very hard time comprehending the fact that computers store text in numerical format, and will faint the first time they see something like U+FFFF. --The preceding unsigned comment was added by 216.135.95.189 (talk • contribs). 8 April 2006
Great work
Great work on the page. It definitely looks nicer. --Preceding unsigned comment added by 216.135.95.189 (talk • contribs) 8 April 2006
Alt+0### is not a Unicode input method
The "###" in "Alt+0###" is not a Unicode code point and therefore this is not a Unicode input method. On my US Win2K system it appears to be giving me Windows-1252 characters. For example, Alt+0128 gives me the Euro symbol, and anything above 255 is entered as if I had typed it modulo 256 (e.g., 256=000, 257=001, and so on). This misinformation about it being "Unicode" is repeated on Unicode#Input methods as well. Please research the actual behavior and fix the articles. --mjb 00:40, 1 May 2006 (UTC)
- I've actually seen some systems, with no relevant third-party software, configured in such a way as to allow unicode input (i.e. Alt-9674 = ◊) - This keeps getting repeated everywhere, and I've seen it with my own eyes, so at some point we need to track down what configuration results in this behavior. —Random832 16:53, 17 March 2008 (UTC)
- Two years later, I think this is now explained (albeit somewhat clumsily) in the article: Alt+9674 = ◊ in a RichEdit control (e.g., in WordPad). (Notepad doesn't use the RichEdit control.) --Keith111 (talk) 13:58, 29 March 2010 (UTC)
On Windows 7 (which appears to come with EnableHexNumpad predefined) I've observed that Alt Num+ followed by numeric keypad digits does not work consistently (in some apps it will work ok, in others it will return a different character than expected), while Alt Num+ followed by regular top-row digits does work consistently in all applications (provided the font supports that character). For example, if I type +0128 using the numeric keypad then I will get € in most single-line edit controls, but Ĩ in most multi-line editors (including Notepad), while Num+ and 0128 via the top row always produces the latter. --202.180.125.97 (talk) 08:26, 14 July 2014 (UTC)
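The two results reported above are consistent with the same digits "0128" being read in two different ways. A rough sketch of that reading follows; it is an interpretation of the observation, not a description of Windows internals, and Python's `cp1252` codec is assumed as a stand-in for the ANSI code page:

```python
digits = "0128"

# Reading 1: the digits are a hexadecimal Unicode code point (the Alt Num+ method).
as_hex_code_point = chr(int(digits, 16))               # U+0128

# Reading 2: the digits are a decimal byte value looked up in the ANSI code page.
as_ansi_byte = bytes([int(digits)]).decode("cp1252")   # byte 0x80

print(as_hex_code_point, as_ansi_byte)
```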
Table
Not sure what the point of a table of the first five alt codes is, especially since they're CP437 characters... AnonMoos (talk) 12:22, 17 February 2009 (UTC)
Attention
Please read the details about alt codes that I wrote before changing the article, so as not to add language-specific codes and the like... 213.130.16.116 (talk) 21:08, 18 February 2009 (UTC)
External link policy
The general external link policy places some requirements on us, as does the "Wikipedia is not a mirror or a repository of links" policy. In the context of this article, alt code "tutorial" or "reference" sites are a dime a dozen, and pretty much all amount to the same thing: some random person self-publishing some knowledge they cobbled together from who knows where, sometimes not entirely correct and usually limited in scope. The majority of them apparently exist to not just be helpful, but to provide a platform for advertising. None of them are providing any official/authoritative info (e.g. from Microsoft), and they're not anything we're recommending for further reading & in-depth research. Rather, they're just cheat sheets.
However helpful some of these sites may be, I think we have a responsibility to exercise discretion in linking to them. I'm not saying we can't link to any cheat sheets, but we can't link to all of them, so we have to decide which ones are worth linking to. That is, we must have criteria by which we gauge the acceptability of an external link, above and beyond the bare minimum required by the general guidelines; otherwise, the list of sites we're linking to will just grow and grow, violating the policies against indiscriminate linking.
Here's what I consider acceptable, above and beyond the general restrictions in WP:EL (e.g. no blogs):
- The site should not be redundant — if two sites are providing mostly the same info, we should only link to one.
- The site does not contain advertising or other self-promotion — posting a link to your Google Ads-laden site is spam, IMHO.
I have some other ideas, but I think these two criteria, if strictly enforced, will keep this article's external links few in number and high in quality. —mjb (talk) 01:38, 21 June 2010 (UTC)
I found the currently-linked alt-codes.org to be really convoluted - I just wanted a simple list since my v key stopped working. I found alt-codes.net to be much clearer. They both have ads. What do you think? goodeye (talk) 22:36, 21 April 2011 (UTC)
- I don't see any ads on http://alt-codes.org/ (though I wouldn't bother to use or recommend it) TEDickey (talk) 23:52, 21 April 2011 (UTC)
- There are ads down in, just not on the home page. e.g., http://alt-codes.org/list/ where I finally found what I was looking for. I agree about not using it; I found it fairly incomprehensible. That's what led me to go looking, and found alt-codes.net goodeye (talk) 02:06, 22 April 2011 (UTC)
The recent link to a site hosted by google is just a graphic of codes, full of ads. If no objection, I'll be deleting it. goodeye (talk) 15:39, 6 April 2012 (UTC)
- Done. goodeye (talk) 16:28, 7 April 2012 (UTC)
Facts about alt codes
"This article contains instructions, advice, or how-to content. The purpose of Wikipedia is to present facts, not to train. Please help improve this article either by rewriting the how-to content or by moving it to Wikiversity or Wikibooks. (December 2009)"
But instructions are facts, and this content was useful and exactly what I was looking for. An implicit purpose of Wikipedia is to be useful. IMHO the facts on this page are useful because Windows is widely used. — Preceding unsigned comment added by Alan8 (talk • contribs) 20:31, 21 January 2011 (UTC)
To remove such simple, useful explanations to another site would be to follow mindless dogma. 192.133.129.4 (talk) 18:28, 18 April 2013 (UTC)
Opera?
Is someone trolling? Because in Opera, when I press Ctrl+Shift+X, it closes... Ws04 (talk) 12:52, 11 December 2013 (UTC)
Linux LibreOffice input
The article mentions "In LibreOffice, OpenOffice.org and Inkscape, for example, only the second method works. In GTK only the third method works." I'm using LibreOffice under Debian (Wheezy) here, and all three methods work just fine. -- Preceding unsigned comment added by 172.218.32.18 (talk) 21:48, 17 April 2014 (UTC)
Composition exiting after the third digit without releasing Alt
The article talks about "composition exiting after the third digit without releasing Alt", but I have never managed to enter multiple characters this way. Petr Matas 10:32, 11 August 2014 (UTC)
- User:Spitzak removed it. Under standard traditional MS-DOS and MS-WINDOWS, no character is entered until the Alt key is released... AnonMoos (talk) 07:52, 13 August 2014 (UTC)
Circular reference via Code Page 437
The character-set table in Code_page_437#Characters is not "Alt codes" as insisted by an editor, because any of the code-pages can be treated in this manner. The logical way forward is presumably to spam every code-page topic into the See also list. TEDickey (talk) 01:26, 15 December 2014 (UTC)
- The page should include links to relevant codepages nonetheless. Perhaps a table? — Preceding unsigned comment added by 66.118.194.251 (talk) 13:06, 3 February 2015 (UTC)
See for example To input characters that are not on your keyboard. TEDickey (talk) 10:09, 15 December 2014 (UTC)
Inadequate support for modulo 256
It is not in dispute that, in Notepad, Alt+306 yields a '2', Alt+307 yields a '3', and Alt+6556 yields a '£'. Nor can it be denied that the decimal code points for these characters are equal, respectively, to 306, 307, and 6556 modulo 256. As I understand the requirement for reliable sources, however, these facts are not sufficient warrant for the implication that Notepad interprets all numbers ≥ 256 modulo 256. Nor does Stack Overflow, referenced in a footnote, make this claim. A second footnote links to a Wikipedia talk page, which is inadmissible as a reference.
I deleted both these references along with the unreferenced claim that "users have accidentally memorized these larger numbers for some characters." My edit was reverted, however, by a respected editor. Is there a reason why I should not restore these deletions?
Peter Brown (talk) 20:36, 27 August 2020 (UTC)
- I have to agree with PMB's analysis. --John Maynard Friedman (talk) 21:10, 27 August 2020 (UTC)
- Simple testing with Notepad will show that every single number you can think of will produce the value mod 256. I guess you can never prove "all" numbers work this way, though it gets pretty unlikely the code is doing something else to show these results. I have no idea what would make a "reliable source", I do feel that users who don't know this but complain about specifically typing in a given number and getting a certain character, is a pretty reliable source.Spitzak (talk) 22:42, 27 August 2020 (UTC)
- I have tried this, entering perhaps 20 numbers. All produced the character specified by the modulo 256 hypothesis. I am convinced. Suppose that I had verified it using 100 numbers. Would this entitle me to claim, in a Wikipedia article, that this is how Notepad works? Of course not — that would be original research. Despite the assertion here, the fact (yes, it is a fact) that mod 256 is used is not easily proven, not to Wikipedia standards.
- What would be a reliable source? An official statement from Microsoft, from people who are actually involved in creating and maintaining applications like Notepad.
- Even a mention in Windows for Dummies would do. In the meantime, it seems reasonable to me to let the statement stand, tagged citation needed, but with a hidden comment saying "this is true, as may easily be demonstrated, but still needs citation". --John Maynard Friedman (talk) 12:48, 28 August 2020 (UTC)
Code pages
The section History and description is confusing. It mentions the "the ANSI code page" three times in the singular, after introducing ANSI code pages in the plural. One time the reference is to "the newer ANSI code page"; maybe the author means "newest"? If so, why not say so, as well as identifying it by its number? The linked section mentions eight ANSI code pages specifically, without any indication of which is newer.
Again, "... if the if code page is not cp1252 some very common Latin accented letters cannot be entered using this approach." Really? According to Code page 437, "The set includes all printable ASCII characters, extended codes for accented letters ...".
I'm not really familiar with code pages, but I think I'm using 437 and I've had no problem producing accents. I can produce a ú, for example, using code 163. If I were using Code Page 1252, wouldn't it be a £?
Peter Brown (talk) 21:42, 28 August 2020 (UTC)
- Some of the problem is there is a large contingent that refuses to acknowledge the importance of CP437 to the IBM PC and will not allow it to be mentioned without also stating that there were some PC compatibles sold that used a different character set (that mysteriously had all the line drawing characters in exactly the same place as CP437!) This always complicates any discussion about early IBM PC development.
- The second comment is due to the first 256 codes being selected from the code page rather than from Unicode. These are not equal to the first 256 code points in Unicode and if any of those are not in the first 256 they cannot be typed by the extended alt key mechanism. This is mainly a problem for code pages that change a lot of points from ISO 8859-1 such as Cyrillic.Spitzak (talk) 21:55, 28 August 2020 (UTC)
- More importantly, there were many code pages, each targeted at a different market, and 437 is not the primary one just because it was the one targeted at the US market. We can't say "Alt+1234 produces this glyph because modulo 256" when it depends entirely on which code page is installed on the users machine. Yes, the principal audience for en.wiki is Western Europe and its (former) colonies, but it is widely read worldwide because English is the most common second language and en.wiki the most comprehensive. Conversely, per PMB's comment, we can't declare that code page XYZ is mandatory because "it depends". I know WP policy doesn't allow us to say so but the code page model is ancient history, Ms have deprecated it, and we should do nothing to encourage it. --John Maynard Friedman (talk) 12:58, 29 August 2020 (UTC)
- JMF says that "the code page model is ancient history" but the Code page article seems not to concur, with present-tense statements like "The majority of code pages in current use are supersets of ASCII ...". Is this article out of date? Even JMF uses the present tense above: "it depends entirely on which code page is installed on the users machine." I do think that I am (not was) using CP437, but I'm willing to be corrected.
- Peter Brown (talk) 21:35, 29 August 2020 (UTC)
- Maybe I exaggerated a little but seriously, how many users of Windows systems since the end of 9x have any awareness of code pages? That article says that Microsoft even tried to delete the facility from Windows but howls of protest from old lags made them leave it in - but is it even covered in mainstream Microsoft documents any more? Present tense is valid because it still exists but is deprecated. I think that we have worked this topic to death so I for one will not contribute to it further. --John Maynard Friedman (talk) 22:42, 29 August 2020 (UTC)
Delete the image at the top of the page?
It's cute and shows off the potential for dynamic images, but does it really contribute to readers' knowledge? One has to read the legend to understand what it is about, and the lead paragraph, by itself, is rather more informative. In addition, doesn't the image pose a danger to readers with photosensitive epilepsy?
Peter Brown (talk) 19:11, 5 September 2020 (UTC)
Making sense of an undoing "with some attempts at clarification"
@Spitzak: As you know, you have added to the article the text:
- In early versions of Windows, attempting to type a code in the OEM page (such as a line-drawing graphic) that did not exist in the Windows code page did nothing. The transition to Unicode improved this, as all the codes in both pages exist in Unicode, so they all work.
This is hard to understand.
We need to observe the difference between codes, which in this case are nonnegative integers, and the graphics produced by the entry of codes into a computer. Line-drawing graphics are not codes, so one cannot literally attempt to type them except in the looser sense that one can type a code that corresponds to the graphic either in a code page or in Unicode.
I am not familiar with "early versions of Windows". However, it is unclear what it would be to type a line-drawing graphic "that did not exist in the Windows code page". In the looser sense defined in the previous paragraph, that would be to type a code that corresponds to the graphic, a code that was not defined in the code page. If it was not defined, however, how could it correspond?
One possible interpretation is that the attempt discussed involved entering the code corresponding to a line-drawing graphic according to one code page when a different code page was in use. Thus, code 201 corresponds to ╔ in both CP437 and CP850 but not in CP1252. The "attempt" referred to could be to enter code 201 in such a way as to yield this character in CP1252. While the attempt would fail, the claim that it would do "nothing" is false, as it would produce the character É.
- No, what it would do when 201 is typed is try to find ╔ in the current ANSI code page, not find it, and (I have heard) ignore the Alt code. Perhaps it instead inserts É but that will have to be tested on an old version of Windows. In either case however the user does not get the expected ╔, so the alt key did not "work". That is what I am trying to describe. If you typed a character from the OEM page that was in the ANSI code page then it does work: for instance if they typed 128 this would be Ç in the OEM code page, which Windows will locate in the ANSI code page at 0xC7 and the code point 0xC7 would be inserted. Spitzak (talk) 17:49, 25 September 2020 (UTC)
"The transition to Unicode improved this," according to your added text, "as all the codes in both pages exist in Unicode, so they all work." What was the improvement? U+00C9 encodes É and U+2554 encodes ╔, but what defect does this address?
- Because ╔ exists in Unicode, so more modern Windows has something to insert, no matter what the ANSI code page is set to. The OEM code page and the ANSI code page are both stored by modern Windows as lookup tables directly to Unicode. If they type 201 then the character U+2554 is found in the map of the OEM code page and U+2554 is inserted and the user sees the expected ╔. If the user types 0201 then the character U+00C9 is found in the map of the ANSI code page and U+00C9 is inserted and the user sees the expected É.Spitzak (talk) 17:49, 25 September 2020 (UTC)
Peter Brown (talk) 17:21, 25 September 2020 (UTC)
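The lookups described in the replies above can be sketched as follows. This is only an illustration of the description, not Windows' actual code: Python's `cp437` and `cp1252` codecs are assumed as stand-ins for the OEM and ANSI tables, and the function names are invented for the example.

```python
OEM = "cp437"    # assumed OEM code page
ANSI = "cp1252"  # assumed ANSI (Windows) code page

def oem_char(code: int) -> str:
    """Alt+code without a leading zero: look the byte up in the OEM table."""
    return bytes([code]).decode(OEM)

def ansi_char(code: int) -> str:
    """Alt+0code: look the byte up in the ANSI table."""
    return bytes([code]).decode(ANSI)

def oem_to_ansi_byte(code: int):
    """Pre-Unicode behaviour as described: find the OEM character in the
    ANSI page and return its byte there, or None if it has no slot."""
    try:
        return oem_char(code).encode(ANSI)[0]
    except UnicodeEncodeError:
        return None

print(oem_char(201), hex(ord(oem_char(201))))    # box-drawing corner, U+2554
print(ansi_char(201), hex(ord(ansi_char(201))))  # É, U+00C9
print(oem_to_ansi_byte(128))                     # Ç is at 0xC7 in the ANSI page
print(oem_to_ansi_byte(201))                     # ╔ has no ANSI slot
```

With both tables mapped straight to Unicode, `oem_char(201)` always has something to return, which seems to be the "improvement" under discussion.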
- In very early versions of Windows, there were no Windows code pages, just OEM pages, so the text won't do as it stands. How about this? I'm drawing facts from Code page 437 § History, so I might be accused of forking.
- In early versions of Windows, only the characters available in the OEM page (CP437) were available, using the Alt key with a number less than 256. These included many box-drawing and other non-alphabetic characters. Subsequently, a Windows code page was made available without the box-drawing characters but with many other symbols, retrieved by entering a 0 prior to the number. In some Windows programs, including Word and Wordpad, a large number of characters from Unicode can be produced with entry of four or more numerals constituting the code point in decimal.
- I thought the very first versions of Windows introduced the ANSI code page, but you may be right that was later. If this is true the earlier text needs some fixing as well, as it also implies that ANSI code pages were added with Windows. Spitzak (talk) 01:18, 26 September 2020 (UTC)
- You're right. I was wrong. According to Aivosto,
- The first version of Microsoft Windows, released in 1985, came with a single character set. It was known as the Windows ANSI character set. This character set was quite different from the character set of DOS, the 437. The most notable difference was with line drawing characters, which are missing in Windows.
- If line-drawing characters were not supported at all, however, we can't talk of attempting to create them. What would count as an attempt?
The first version of Windows which was a big commercial success was 3.1, so I'm not sure that version 1.0 (used by a very small number of people) is all that relevant... AnonMoos (talk) 06:24, 26 September 2020 (UTC)
"Much" vs. "legacy" software
[edit]@John Maynard Friedman:According to your recent edit, only "legacy" software interprets codes over 256 modulo 256. Try out some software that you don't consider legacy! Alt+419 yields ú and Alt+0419 yields £ in the Wikipedia edit box and in the password field at https://www.walgreens.com/login.jsp, whether I'm in Firefox (released 2004) or in Chrome (released 2008). These are the CP437 and CP1252 characters, respectively, for code 163, and 163 ≡ 419 modulo 256. Mod 256 lives on! Sicut erat in principio ... well, not really. Peter Brown (talk) 22:33, 28 September 2020 (UTC)
- Maybe it is Windows that is the legacy software :-(
- Ok, I will revert. --John Maynard Friedman (talk) 23:56, 28 September 2020 (UTC)
- "Many a true word is spoken in jest". Thinking about this for a moment, I would expect a competently designed app to assume the OS will hand over the unicode code-points for whatever the user is typing. The keyboard handler is an OS function, the app should not get involved. (Yes, I understand that DOS apps did that sort of thing but that was before the ark.) So it must be that it is Windows that is doing the modulo 256! It is certainly doing the code-page/Unicode transformation. --John Maynard Friedman (talk) 00:29, 29 September 2020 (UTC)
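The behaviour reported at the top of this thread (Alt+419 giving ú, Alt+0419 giving £) matches a simple model: reduce the number modulo 256, then pick the table by whether a leading zero was typed. A hedged sketch of that model is below; `alt_code` is a hypothetical helper, and CP437 and CP1252 are assumed as the OEM and ANSI pages, which will not hold on every system.

```python
def alt_code(typed: str, oem: str = "cp437", ansi: str = "cp1252") -> str:
    """Model of the observed behaviour, not of Windows internals.

    A leading zero selects the ANSI page, otherwise the OEM page;
    the number itself is taken modulo 256.
    Caveats: Python's cp437 codec returns control characters for 0-31
    rather than the CP437 graphics, and a few cp1252 bytes are undefined.
    """
    codec = ansi if typed.startswith("0") else oem
    value = int(typed) % 256
    return bytes([value]).decode(codec)

for typed in ("163", "419", "0163", "0419", "6556", "2547"):
    print(f"Alt+{typed} -> {alt_code(typed)}")
```

This reproduces the observations quoted on this page, but agreeing with observations is of course not a citable source for how the keyboard handler works.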
"Typing the character"
I deleted the sentence, "Many Wikipedia articles on various characters will include how to type that character using Alt codes for code page 437". I acknowledged that "This is true for 'æ' but not common." Spitzak reverted my deletion, claiming that "Such 'typing the character' sections are in virtually every article on a letter or punctuation mark" but giving no examples.
- Such information is indeed provided for å, ç, £, and ñ.
- There is nothing at all of this sort for ê, û, (, or /.
- For characters in the Latin alphabet and for some punctuation, the ASCII value is given. It isn't mentioned that this same code can be used with Alt. For numerals 0 –9, not even the ASCII code is provided.
- The article for " notes that this can be done but doesn't say how.
Suppose that, instead of deleting the sentence I replace "Many" with "Some". Will that meet with approval?
Peter Brown (talk) 00:10, 3 October 2020 (UTC)
- Sure. I was mostly talking about punctuation marks actually, like division sign.Spitzak (talk) 01:20, 3 October 2020 (UTC)
Alt+some numeral "goes to" some code page.
@John Maynard Friedman: Please! I am not "actually saying that Alt+099 goes to the OEM CP but Alt+0100 goes to Windows CP." I wish I knew enough about the process to assert that anything goes anywhere. I only claimed that entry of some Alt codes produces some characters.
There is considerable overlap between code pages. I do know that Alt+99, Alt+099, and Alt+0099 each produces a c and that Alt+100 and Alt+0100 each produces a d. I know that many code pages could explain these productions; I certainly do not know which is operative in each case. If I seem to be claiming more, please edit my claims to forestall this implication.
My complaint about the previous text is that it claimed that
- Holding Alt and typing three digits (first one non-zero) would attempt to translate the code from the 8-bit OEM code page (for example, CP850) to a matching glyph in the Windows code page
whereas, in fact, Alt+227 produces a π from the OEM page (CP850 or CP437) without trying to "translate" anything; π isn't even present in CP1252. Nor are three digits necessary: Alt+19 produces ‼ which, again, isn't in CP1252.
While I have your attention, I hope that you can elaborate on the final sentence in the section, "The transition to Unicode improved this, as all the codes in both pages exist in Unicode, so they all work." Sure, codes 0 –255 all exist in Unicode, as do codes 256 –200000. How does that ensure that "they all work"? Work in what application? In Notepad I cannot produce Đ at all! Just what did "the transition to Unicode" improve?
Peter Brown (talk) 02:10, 3 October 2020 (UTC)
- @Peter M. Brown:, I've concluded that I just don't know how the process works, so I had best not try to "correct" anything. Your question would be better addressed to our Japanese friend because the code-pages that we see overlap so heavily. To generalise, we need a counter-example to try to disprove the thesis.
- "Transforms" might be a better word than "translates"?
- For a moment, I thought that Spitzak had finally explained it: that the keyboard handler transforms a key-press to a scan-code and the code-page handler transforms that to a Unicode code-point and presents it to the app, so that modern files (and the display handler) deal only in Unicode. That would work great whether you have a commodity US or RU keyboard – or even some advanced multi-lingual stick-shift with overhead cam and IPA. This model matches my concept of how a modern OS should work. But your observation that modern device-independent apps like Firefox are being treated same as Notepad just blew that away. In a rational design, apps should get nowhere near the keyboard handler. In summary, I haven't a clue and the article is no closer to telling me: it is still in the realm of phenomenology. --John Maynard Friedman (talk) 08:27, 3 October 2020 (UTC)
- @John Maynard Friedman: You wrote, "More info needed: what happens when a user types Alt+0257? Alt+0500?" The article already says that "many Windows applications, including Notepad, interpret all numbers greater than 256 as modulo 256." Isn't that sufficient? I could add that Word and Wordpad return the Unicode characters, though that information is already in Unicode input. Do you think that each of the two articles should contain a link to the other? Peter Brown (talk) 19:15, 3 October 2020 (UTC)
- What we have now is close to divination, scarcely removed from wp:OR. We record what we observe to happen using systems set for western European and colonies. We don't explain why it happens. We don't explain why it is that modern apps are constrained to behave as if running on Windows 3.1 (is Edge equally hobbled?). And it is not generalised for non-Western code-pages (Windows or OEM or both). --John Maynard Friedman (talk) 19:29, 3 October 2020 (UTC)
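For what it is worth, the "modulo 256" behaviour quoted from the article in this thread can be modelled in a few lines. This is only a sketch of the observed behaviour, not a description of how Windows or Notepad is implemented: the function name `modulo_256_interpretation` is made up for illustration, and the choice of CP1252 for zero-prefixed codes is an assumption taken from the discussion on this page.

```python
def modulo_256_interpretation(alt_code: int, code_page: str = "cp1252") -> str:
    """Hypothetical model: reduce the typed number modulo 256, then look
    the remainder up in a single-byte code page (assumed to be CP1252)."""
    byte_value = alt_code % 256
    # Note: a few bytes (e.g. 0x81) are unassigned in Python's cp1252 table
    # and would raise UnicodeDecodeError; none of the examples below hit them.
    return bytes([byte_value]).decode(code_page)


# The two codes asked about above, plus the Alt+0339 example from this page.
for code in (257, 339, 500):
    print(code, code % 256, repr(modulo_256_interpretation(code)))
```

Under these assumptions, 339 reduces to 83 (the letter S), 500 reduces to 244 (ô in CP1252), and 257 reduces to 1, which is a control character rather than a printable glyph.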
"So they all work" — How does this follow?
An editor has recast the sentence quoted here. I do not find the new version less problematic than the original, however. Please see my comments in the following section. Peter Brown (talk) 23:09, 3 October 2020 (UTC)
The final sentence in the section Alt code § History and description reads "The transition to Unicode improved this, as all the codes in both pages exist in Unicode, so they all work."
"They" apparently refers to the OEM and Windows code pages, each of which pairs most codes 0 –255 with graphics characters. They do "work" in the sense that entry of a number less than 256 with Alt held down produces a glyph in the operative font from either the OEM or the Windows code page, depending on whether or not the number is prefixed with a zero. Larger numbers are irrelevant as they are not assigned to characters by the code pages. Just what did "the transition to Unicode" improve? Unicode does not seem relevant.
Peter Brown (talk) 18:00, 3 October 2020 (UTC)
- And the explanation needs to be sufficiently general to encompass other code pages, so as to explain
::Just to confirm: Alt+0163 (with the 0) produces the following results with the various keyboard layouts: Japanese keyboard 」 Microsoft IME 」 Microsoft Pinyin 」 Thai Kedmanee ฃ US/UK £. QED.
- at talk:pound sign#Explanation for Alt keycode 6556?. --John Maynard Friedman (talk) 19:00, 3 October 2020 (UTC)
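The Alt+0163 observations quoted above are at least consistent with the same byte value being looked up in different single-byte or lead-byte code pages. The sketch below uses Python's built-in codec tables as stand-ins; the mapping of keyboard layouts to code pages (`cp1252`, `cp874`, `cp932`) is an assumption made for illustration, not something established on this page, and the Pinyin case is left out because byte 163 alone is not a complete character in that encoding.

```python
# Assumed layout -> code page pairs (illustrative only).
ASSUMED_CODE_PAGES = {
    "US/UK": "cp1252",
    "Thai Kedmanee": "cp874",
    "Japanese": "cp932",
}


def lookup(code: int, code_page: str) -> str:
    """Decode a single byte value using the given code page."""
    return bytes([code]).decode(code_page)


for layout, code_page in ASSUMED_CODE_PAGES.items():
    ch = lookup(163, code_page)
    print(f"{layout:14} {code_page:7} U+{ord(ch):04X} {ch}")
```

If the assumption holds, the number typed never changes; only the table used to turn it into a character does.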
How does Unicode facilitate producing characters with Alt codes?
The section Windows contains the text
- Before Unicode was introduced ... [c]haracters that did not exist in both code pages (such as a line-drawing graphic from the OEM page) could not be inserted in software using the wrong code page and either were ignored or produced an unexpected character. The transition to Unicode improved this, as all the characters exist in Unicode and all can all be inserted.
At present, the line drawing character ├ can be inserted by entering Alt+195. This uses the OEM page CP437. If one inadvertently enters Alt+0195, one obtains the "unexpected" character Ã from CP1252. However, this state of affairs is characterized as the situation before Unicode was introduced. What is the difference now? Every character in either code page can be inserted, just as it could be prior to the introduction of Unicode.
It is unclear how the transition to Unicode improved matters.
Peter Brown (talk) 23:10, 3 October 2020 (UTC)
- I'm unsure how this is confusing. But anyway, imagine a text editor that purposely stores its text in CP1252. If the user types Alt+0195 then it generates Ã, which can be inserted as that has a code point in CP1252. But if the user types Alt+195 it generates ├, which does not exist in CP1252, so there is nothing the text editor can do to cause it to appear. It is true that the program may insert a completely different character, such as Ã, but the result is the same: the user's attempt to type a ├ did not "work". Any result other than seeing a ├ inserted is a failure.
- A text editor that stores its text in Unicode, however, has a code point for both of these characters, and therefore can insert both of them, so it is now possible for both sequences to work: one of them produces Ã and the other produces ├. Spitzak (talk) 18:29, 4 October 2020 (UTC)
- I'm not interested in imagining made-up text editors. I think it significant that the procedure for retrieving characters requires one keystroke more if the source is a Windows code page than if it is an OEM code page. This strongly suggests that the Alt code technique was not first rolled out for Windows pages, with OEM pages as an afterthought — what would be the point of requiring the user to enter a zero if entry without a zero were not at least contemplated? The choice between code pages was most likely there from the beginning, in 1985, and your imaginary text editor never existed.
- The development of Unicode came later, starting in 1988. The following year, 1989, Word for Windows was released. Perhaps, even at that early stage, it had Alt-code support for Unicode code points over 255. In any case, no reason has been given to suppose that the introduction of Unicode improved matters.
- I don't know what Alt codes have to do with it, but before Unicode, only specialized niche programs and certain comprehensive word processors supported a single document containing characters belonging to different codepages, and they did so by means of ad-hoc font-switching. There was no real interoperability between programs. This started to change with the release of Windows NT 3.1 in 1993... AnonMoos (talk) 02:38, 6 October 2020 (UTC)
- The current version of this sentence reads
- Modern software uses Unicode which contains all the characters, and thus both the zero-prefixed and non-zero-prefixed Alt codes all work nowadays.
- As it stands, this is a non sequitur. Perhaps an account of just how modern software uses Unicode would bridge the yawning gap between
- The antecedent: "Modern software uses Unicode which contains all the characters" and
- The consequent: "both the zero-prefixed and non-zero-prefixed Alt codes all work nowadays."
- Then I could delete the {{Explain}} template.
- Still not sure where the confusion is, but:
- "work" means: the user sees the character they expected to result from typing the Alt code.
- There are about 350 different characters that can be produced by both the OEM and Ansi Alt codes (a lot of characters are in both sets so not 512)
- "Non-modern software" used something other than Unicode. In many cases this something could only store at most 256 different characters, therefore some subset of the 350 different possible Alt codes could not be stored in it.
- If the user typed an Alt code for a character that the software could not display, then something else would happen other than that character appearing, and therefore the Alt code did not "work".
- Unicode-using software can store all those 350 different characters (and many others), so it is possible for all Alt codes to "work".
- Spitzak (talk) 22:21, 6 October 2020 (UTC)
- Please:
- Your definition of "work" includes reference to the user's expectations. If a user types Alt+0339 expecting to see a œ but what is produced is an S because the software interprets the input modulo 256, did the Alt code not work? It did precisely what it was supposed to do, given the software.
- Yes that is what I am using as the definition of "work". In your example the Alt code does not "work".Spitzak (talk) 19:26, 7 October 2020 (UTC)
- What makes software "Unicode-using"? And what is it for software to use "something other than Unicode"? In general, what is it for software to use something in this sort of context?
- Peter Brown (talk) 04:14, 7 October 2020 (UTC)
- Does the software store the text the user types as Unicode, or some other encoding. The other encodings of interest are "code pages" which mostly can only store 256 different characters and thus it is impossible for them to store all the possible Alt code inputs.Spitzak (talk) 19:26, 7 October 2020 (UTC)
- An Alt code consists of 1–8 decimal digits, each in the range 0–9. Going back to the '60s, these were mostly stored in ASCII, which is a subset of Unicode; it follows that they were stored in "Unicode", though the term hadn't been invented yet. But the digits do need to be stored one by one, perhaps in ASCII, with the Unicode code points greater than 39₁₆ coming into the picture somehow (explanation required) when the Alt key is released. I don't see this as a matter of how the software "store[s] the text". Peter Brown (talk) 21:18, 7 October 2020 (UTC)
- Except that MSDOS 'borrowed' 0-31 for other things so, although the file contained a number, the 'meaning' associated with that number was not its meaning as defined in ANSI's ASCII standard. And of course 128-255 was gun-slinger country. "Extended ASCII" wasn't standardised and code pages (especially those with box drawing characters) were wildly at odds with the ISO standards for the same code-points. And of course MS played fast and loose with 128 to 159, leading to OS-dependent file contents as I raised previously. So I suggest that you are walking us into a semantic trap if you try to retrofit Unicode. I'd turn back if I were you, Dorothy.
- Coming back to Spitzak's "how the software stores the text", what really happens is that the software (or today, the file handler) stores a bit string. What that bit string 'means' was app dependent as given by the TLA (.doc, .db2, etc). Clicking on the file invoked the associated app and that either 'just knew' (because it was only ever sold in the US and Canada, or just took it on faith that the code page used to write it was the same code page used to read it) or had some sort of header to tell it which code page to use. --John Maynard Friedman (talk) 16:37, 8 October 2020 (UTC)
- John, you've introduced the phrase "the file" for the first time on this entire talk page. It would be helpful to know what file you are referring to. One of these uses is in the context "clicking on the file", which is doubly obscure: some applications represent files as icons that can be clicked on, but I didn't think that these were under discussion.
- Spitzak, Wikipedia:Talk page guidelines § Editing others' comments discourages interleaving responses with another editor's material. I've done it myself, when there were sixteen points to which I wanted to respond and, in that case, I colored my responses red to set them apart. When there are only two, it's unnecessary and confusing. I had to spend time figuring it out.
- The text with the {{Explain}} tag currently reads:
- Modern software uses Unicode which contains all the characters, and thus both the zero-prefixed and non-zero-prefixed Alt codes all work nowadays.
- Spitzak's definition of "work" is idiosyncratic and not, I suggest, what is meant in the tagged sentence. If software interpreting Alt codes modulo 256 produces an S in response to Alt+0339, I would say that it worked, regardless of the user's expectation, and I'm sure that the average reader would agree.
- Yes, "Modern software uses Unicode" and yes, Unicode "contains all the characters." Does it follow that "both the zero-prefixed and non-zero-prefixed Alt codes all work [in the normal sense] nowadays"?
- No. Word, in Microsoft Office 2019, is surely modern software. In this application Alt+42700 and Alt+042700 do not work, nor do any other codes in the Bamum Unicode block. Q.E.D.
- Peter Brown (talk) 20:40, 8 October 2020 (UTC)
- Files: heretofore [I always wanted a good excuse to write that], we have been using the rather loose expression "the software stores the text". So where does it store it, except in a file (unless it is going to process it first, as in Excel, but the result has to be stored somewhere, even if it is discarded on exit). But even if it holds it in memory, it is still a bit string and 'meaning' is assigned by the application. Pre-Unicode-aware-systems, an arbitrary code-point in the 0-255 range could be (and was) interpreted as Latin, Cyrillic, Pinyin or Kana according to the user's current software and OS. I thought Spitzak said that nowadays Windows only uses the code-page to determine the user's intent at input time but always stores the unique Unicode values in the file/memory: that if true would be eminently sensible and obvious.
- Whether the file is presented as fully qualified file name, a simple filename with the TLA hidden, or a pretty icon, is just a presentation issue that doesn't change the fact that there is a real file or database structure holding actual data stored as a bit stream. There are two ways [that I know of] to read a file: click on the file to invoke the default app declared for that TLA, or start the app and point it at the file that you want it to process. (e.g., for a .jpg, the default may be MsPaint but this time I may want to use Photoshop instead because I want to do some heavy engineering on it.) --John Maynard Friedman (talk) 21:06, 8 October 2020 (UTC)
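The storage argument made earlier in this section (a character can only be kept if the editor's internal encoding has a slot for it) can be tried out directly. This is a sketch of that argument only, using Python's codec tables as stand-ins for CP437, CP1252 and Unicode storage; the helper `can_store` is a made-up name, and nothing here shows what any particular Windows application actually does.

```python
def can_store(ch: str, encoding: str) -> bool:
    """Return True if the character has a representation in the encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Byte 195 looked up in the OEM page CP437 is the box-drawing character.
box = bytes([195]).decode("cp437")

for encoding in ("cp1252", "utf-8", "utf-16-le"):
    print(f"{box!r} storable in {encoding}: {can_store(box, encoding)}")
```

On this model, a CP1252-only store has to reject or substitute the box-drawing character, while a UTF-8 or UTF-16 store can keep it. Whether that is what the tagged sentence means by "work" is the point still in dispute above.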
Interpreting interpretation
The article currently reads "Some applications ... interpret many Alt codes larger than 255 directly as decimal Unicode code points." Spitzak has added the text "this is unrelated to whether the software can actually display those code points" and deleted my claim that these applications fail to interpret a letter in the Bamum script.
First off, it must be acknowledged that machines can be said to interpret only metaphorically, except when a process converts instructions into actions without prior compiling. The present sense differs in that the inputs are not instructions.
What meaning can be attached to the statement that an Alt code is interpreted as a Unicode code point? On a verificationist view, the meaning of a sentence is given by the conditions under which it can be verified or falsified. If software produced a Ω whatever Alt code was entered, we would say that no, it was not interpreting the codes as Unicode code points. If it produced a glyph corresponding to the code point for every Alt code 256–2000, we would take that as evidence that the software interprets Alt codes larger than 255 as decimal Unicode code points. Perhaps Spitzak rejects all forms of verificationism; if so, his claim that whether applications interpret Alt codes as code points is unrelated to whether the software can display the associated glyphs is a personal opinion and should be flagged as such.
On my part, I should not have implied that the failure of Alt+42700 to render a Bamum letter refutes the statement that the named applications interpret the Alt code as a Unicode code point. I should have qualified the claim with "perhaps" or something of the sort. It may be, after all, that the software produced a precise rendering of the character in PNG or SVG but some failure of the glyph-rendering mechanism resulted in a ⍰.
Peter Brown (talk) 18:18, 10 October 2020 (UTC)
- As Spitzak said in their edit note, the issue here is whether or not the user has a suitable font. If there is no font with a code-point for that glyph, the fallback font last resort "tofu" will be shown, which is probably what you got. So the app may request the OS to display the Klingon Ǵřġ'Ħƕ code-point, the code is valid Unicode [for the purposes of argument], but the OS doesn't have Klingon.ttf so no can do. It is not a rendering fault: "no fault found, the system is working as designed". Alternatively, Word barfs at anything higher than 9FFF, so your 42700 (A6CC) is out of bounds but, if so, that is a limitation in a specific app, not a failure of the principle. BTW, I can see your 42700/A6CC because my chromebook has the almost-everything Noto fonts.--John Maynard Friedman (talk) 19:02, 10 October 2020 (UTC)
- But fundamentally it comes back to my question all along: why does Windows fail to interpret Alt+0ddddd as an instruction to store exactly that code point? What were they thinking? --John Maynard Friedman (talk) 19:11, 10 October 2020 (UTC)
- My issue is not fonts but how "interpret" is to be understood. Dictionary definitions mostly define the term as something that people do; the exception is to define it, to quote Wiktionary, as "To analyse or execute (a program) by reading the instructions as they are encountered, rather than compiling in advance." That is inapplicable, as numbers are not instructions in any programming language I know of. Can you propose a better definition, one that works in this context?
- If, as you suggest, "Word barfs at anything higher than 9FFF", then its inability to handle 42700 has nothing to do with fonts, only with the magnitude of the number, and Spitzak is mistaken, as are you in backing him up. As regards your "fundamental question", what is your reason for thinking that "Windows fail[s] to interpret Alt+0ddddd as an instruction to store exactly that code point?" Before barfing, that is. Perhaps Windows correctly stores 42700 as A6CC, realizes that the number exceeds 9FFF, and then barfs, spewing out a ⍰ in its disgust. That would not be a failure of interpretation.
- Peter Brown (talk) 21:45, 10 October 2020 (UTC)
- In this context, interpret means to parse the bit string as UTF-8 and thence derive a Unicode code-point. We could have interpreted it as a jpg image, an MP3 music file or an MP4 video etc etc, but for now let's assume we have a text string. Theoretically that code-point may not have a defined glyph but again let's assume the general before the exceptional. That is what I mean by interpret. The Wiktionary definition is way too limited, I think only BASIC works that way. Btw, machine code is a numeric programming language.
- I don't suggest that Word barfs at numbers greater than 9FFF, only that it is a possible if unlikely explanation that would fit the evidence. It can be disproved easily. It would only add to the improbability of that explanation that it would be programmed to know its limits but then fail to give a sensible error message. The 'tofu' block is afaik always the result of your system not having any font with a glyph for that code-point. That is standard behaviour of every OS and web-browser I know, so Occam's razor time. It is a font issue. Like I say, I can see your glyph on Chrome OS, so it has been created and transferred correctly to Wikipedia's servers.
- For my fundamental question, see talk:pound sign where I tried to tell our Japanese friend to ignore the evidence of their own eyes.
- --John Maynard Friedman (talk) 23:40, 10 October 2020 (UTC)
- I point out that entry of 42700 generates five bit strings, one per digit. If each were held in an eight-bit byte and they were concatenated, that would be a string of 40 bits. How is this parsed as UTF-8?
- Should I have known that ⍰ is always the result of one's system lacking a font with a glyph for that code-point? That is surely not a "sensible error message"! I believe you, but this needs to be better documented. Tofu (disambiguation) only shows � and ␚. So ⍰ (U+2370) should be listed to help us dummies.
- Incidentally, only PSPad gave me a ⍰. Word and Wordpad each gave me a space — that is, the cursor advanced to the right without displaying a character. Should that have alerted me to a font problem?
- According to the article UTF-16, the encoding "UTF-16 is used internally by systems such as Microsoft Windows, the Java programming language and JavaScript/ECMAScript." According to the article UTF-8, GB 18030 has a 14.5% share in China. Et cetera. Are you sure you want to build UTF-8 into the definition of "interpret"? If so, go ahead and edit Wiktionary.
- UTF-8 (qv) is designed to encode and decode in multi-byte strings, however many are needed to encode the character. As I already said, a bit string could be a picture, a piece of music, a video, a genome, whatever. UTF is just one of many ways to interpret a bit string, just as there are many ways to interpret marks on a page or sounds in the air.
- Because of the legacy of Windows, apps do things that elsewhere would be handled by the OS, so you can't expect consistency. I have no idea why Word just moves on, do a bug report. In a rational system, it would tell the OS "here, display this" and the OS would find a font that did so or display the last resort character (which is definitely not SP in any sensible world).
- Jargon like "tofu" and mojibake is (and should be) a long way from most readers' experience. So Wikipedia has the {{Contains special characters}} template (example right) to intercept without getting bogged down in geekspeak. --John Maynard Friedman (talk)
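On the question raised in this thread of how five typed digits relate to UTF-8: one way to picture it is that the digits are first accumulated into a single integer, and only that integer (the code point) is then encoded. The sketch below illustrates that idea in Python and nothing more; the function name `alt_digits_to_code_point` is hypothetical, and it is not a claim about what Windows, Word or PSPad do internally.

```python
def alt_digits_to_code_point(digits: str) -> int:
    """Accumulate decimal digit keystrokes into one integer, as a
    hypothetical input handler might do while Alt is held down."""
    value = 0
    for d in digits:
        value = value * 10 + (ord(d) - ord("0"))
    return value


code_point = alt_digits_to_code_point("42700")
character = chr(code_point)

print(hex(code_point))                      # the code point in hexadecimal
print(character.encode("utf-8").hex())      # three bytes in UTF-8
print(character.encode("utf-16-le").hex())  # one 16-bit unit in UTF-16
```

On this picture the forty bits of digit keystrokes are never themselves parsed as UTF-8. UTF-8 and UTF-16 are simply two different byte serialisations of the same number, and whether a glyph is then drawn is a separate question of fonts.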
Nonstandard use of "interpret"
- Your use of the term "interpret", which is pretty much Spitzak's, is quite nonstandard and the term should not be used this way in Wikipedia. Look at the example sentences from The Free Dictionary or Merriam-Webster. All of them involve using some sort of input to produce an output intelligible to humans. The one that comes closest to your usage is from M-W, referring to a blind child trying to learn remotely:
- All of these were inaccessible for Kai — his screen reader couldn't interpret any of them.
- Kai's problem was that his device could not produce a braille or audible output that he could understand. Take him out of the picture and there remains nothing to say.
- Yes, a bit string could be a JPG image. That means it could be rendered as a visible image according to a standard procedure. If the wavelengths so rendered are too long to elicit a response from human rods and cones, though, then there is no image and the software involved does not interpret anything.
- Interpreting a string is not a matter of producing a Unicode code point. As the Unicode Standard says, a code point is simply a number, and numbers are invisible, inaudible, odorless, etc. Only when something visible is produced, something that depends in detail on the code point, can whatever produced the code point be said to have interpreted.
- An exception, I have to admit, is definition #4 at Wiktionary:interpret#Verb. You have noted that this is a rare use, not applicable here.
- Peter Brown (talk) 18:36, 12 October 2020 (UTC)
- You are taking us into reductio ad absurdum territory. Let's look at this more widely: suppose I send you a file that contains elements that are clearly legible in my non-windows system but you can't read it on your PC. Is that a fault of the encoding? or in fact is it not a fault in your Windows system (or your app)? The code is not inherently unintelligible, your system is broken if it doesn't even give you a tofu to tell you there's a problem.
- The case study of Kai, the blind child, is rather different. I expect his screen reader can cope with the range of Unicode glyphs that a K-12 student could expect to encounter. So almost certainly the cause of the unintelligible - to him - material was inappropriate styling, gee-whizz stuff that looks real purdy to sighted readers but blocks those with sight impairment. Kai is not handicapped in the abstract, he is being handicapped by sloppy and thoughtless design: that is the point of the anecdote. --John Maynard Friedman (talk)
- Thank you for the file. Unfortunately, my PC is unable to interpret some of the elements — that is, render them in a form that I (a human) understand. Perhaps you could send me screen shots in a PNG file, instead.
- Kai's situation is analogous. In his case, the deficiency lies in his visual apparatus coupled with insensitivity on the part of his putative educator. I am presented with tofus and he with garbled braille. Comprehension fails, so interpretation does not occur; the two are tightly linked. If there is no comprehension, translation of text into Unicode or UTF-8 is not interpretation, notwithstanding your attempted definition or Spitzak's very similar supposition.
- Your views are not supported by dictionary definitions or by the example sentences. In consequence, I have to maintain that they have no place in Wikipedia.
- Peter Brown (talk) 22:43, 12 October 2020 (UTC)
- Another example: somebody sends me the calculations needed to launch a satellite constellation. The figures, mathematical symbols and diagrams are all entirely valid and a rocket scientist would admire or go, look at this kindergarten error. I, however, am not a rocket scientist. My eyes receive the input but I am unable to interpret what I am seeing, I cannot comprehend it. But at least I know what I don't know, the paper is not blank because my printer couldn't be bothered with those funny Greek letters. (If it had, I would blame the printer, not the rocket scientist or my eyes).
- I have gone as far as I can with this discussion. If you don't like "interpret", find a better word but it will mean essentially the same thing.--John Maynard Friedman (talk) 23:41, 12 October 2020 (UTC)
- See also Epistemology. --John Maynard Friedman (talk) 23:44, 12 October 2020 (UTC)
- I am arguing that interpretation and comprehension are intimately tied. In your example, you say, "I am unable to interpret what I am seeing, I cannot comprehend it." That just reinforces my point.
- I am quite comfortable with the word "interpret". I have no idea why you think that I don't like it.
- I've no idea what you expect me to get out of the Epistemology article.
- Best wishes, Peter Brown (talk) 01:45, 15 October 2020 (UTC)
- You seemed to be moving into epistemological questions like 'the meaning of "meaning"' (or interpretation of "interpret"), that are way above my pay grade. --John Maynard Friedman (talk) 08:56, 15 October 2020 (UTC)
- See The Meaning of Meaning. This is a matter of semiotics, not epistemology. You may wish to reflect on the distinctions between the subfields of philosophy. Peter Brown (talk) 17:23, 15 October 2020 (UTC)
- We have an article for that? I always thought it was a philosophers' joke: how can you say what a word means if you have to define "mean" first. Catch 22. Well, I never! That is SO far above my pay grade. Wikipedia doesn't pay me enough to think deep thoughts so I will continue to take the advice of the White Queen (or was it the Duchess? The Hare?) "It means exactly I want it to mean, no more and no less." --John Maynard Friedman (talk) 18:05, 15 October 2020 (UTC)
- None of the above. It was Humpty Dumpty. Cheers, Peter Brown (talk) 18:41, 15 October 2020 (UTC)
- The word "interpret" is certainly used in computer science for exactly the way we are using it, an obvious example is interpreted language. In no way does the word "interpret" mean "the output is something humans can read". In fact when working with computers it almost always means "the input is something humans can read". To be truthful, AI is nowhere near powerful enough yet so that there is any real research or work on interpreting computer data into human-understandable data, other than brute transcription.
- If the Alt code inserts a character into the text, imho it "works". The fact that the software fails to draw this character on the screen has NOTHING to do with Alt codes "working", especially if all other methods of inserting that same character also cause the display to fail.Spitzak (talk) 20:57, 15 October 2020 (UTC)
- The article Interpreted language does not provide an "obvious example", as the word "interpret" does not occur anywhere in it, though of course "interpret" is the root (linguistics) of "interpreted".
- We need a definition of "work" that is not jargon — one that isn't limited to computer experts. The best approach is surely to consult dictionaries; that's what they're for. Earlier in this thread you said:
- "work" means: the user sees the character they expected to result from typing the Alt code.
- In your latest post, however, you say that an Alt code works if it inserts a character (any character?) into the text, by which I think you mean a text file. This would not require a user to see anything. Please make up your mind and identify a dictionary in substantial agreement, if "work" in the relevant sense is not to be considered MOS:JARGON.
- Peter Brown (talk) 01:32, 18 October 2020 (UTC)
- No, Alt+code is an input operation. So if Alt+nnnnnnnnnnnnnnnnnnnnnnnnnn succeeds in inputting the equivalent [according to Microsoft] code point for the OS or application to process, then it has worked. Whether or not I have the means to output (display, print, emboss, sound, vibrate, stink) that code-point is a different issue entirely. A third issue, again quite distinct, is the mapping [aka translation] between nnnnnnnnnnnnnnnnnnnnnnnnnn and a standardised Unicode code-point. --John Maynard Friedman (talk) 17:31, 18 October 2020 (UTC)
- Surely, what it is for something to "work" depends on one's interests. What is it for an automobile's brakes to work? Is it for them to stop or slow the vehicle? Or is it to produce substantial friction between the brake pads and a disc or drum mechanically linked to two or more wheels? The driver and the auto mechanic will see matters differently.
- According to https://www.alt-codes.net,
- IBM developed a method to place the characters that can not be typed by a keyboard on the screen: while keeping the Alt key down, typing the code defined for the character via the numeric keypad.
- I don't claim that this is historically accurate, but suppose that it is. Then the intended function of an Alt code was apparently to place non-keyboard characters on the screen, and IBM may have directed its programmers to develop this function without specifying many details. I cannot see that the function has changed since then, though the method developed may now serve other purposes and the set of non-keyboard characters available has exploded. For something to work is for it to be functional, according to Webster's College Dictionary from Random House Kernerman, available from The Free Dictionary.
- Peter Brown (talk) 19:46, 21 October 2020 (UTC)
- A car's brakes are defined to 'work' if pressing the pedal causes the brake shoes to be applied, which slow then stop the wheels rotating. Whether or not the car stops in time to avoid a crash is "out of scope". The brakes worked. The claimed IBM definition, if true, is a very sloppy specification, especially for early DOS screens where character generation for display was immutably hard-coded. --00:30, 22 October 2020 (UTC)
Best known?
As of today, the lead(!) of the article claims that alt-codes are the best known or even only way to enter off-keyboard characters. Maybe it's because we have AltGr where I live - which limits the number of off-keyboard characters ever needed - but in my experience most people use the charmap app or the Insert Symbol in MsOffice apps for anything more. Certainly that is what students are taught. No doubt certain subject areas have such frequent need for particular characters that it is worth learning their alt codes, but this case is the exception rather than the norm. --John Maynard Friedman (talk) 08:09, 31 August 2021 (UTC)
- It was the best-known way in the United States during the 1980s and 1990s. (The Windows 95 / 98 / ME Character Map Accessory did not go beyond character 255, and did not include Code Page 437 characters). AnonMoos (talk) 06:41, 3 September 2021 (UTC)
- Change sentence to "For users with early versions of MsWindows and particularly for those with US keyboards, it was ...."? --John Maynard Friedman (talk) 07:32, 3 September 2021 (UTC)
- I don't think even that is true. I'm old enough to remember this, and using alt codes was very much a DOS thing. (Although they still work of course.) My father still knew a few commonly used ones by heart, but I (and people roughly my age and younger) never really bothered with them because by that time we got a version of WordPerfect for DOS that had a compose key, I think it was ^V. The first version of Windows I ever really used was 3.1 and it had dead keys, which I found much more comfortable to use. (Maybe the version of DOS we used could have done that too, but it might have been misconfigured.) The above-mentioned limitation of Charmap wasn't really a limitation in practice, because until Word 97 came out you likely didn't use any software that supported Unicode. (Did you know that WordPerfect was never updated to get Unicode support?) Before that, software such as WordPerfect that supported characters outside of the Windows code page provided their own dialogue box for this (WP 5.1: ^W for WP character sets), as did Word 97 of course. — Preceding unsigned comment added by 77.61.180.106 (talk) 00:33, 13 October 2021 (UTC)
- Thank you. As the assertion has failed to find a supporting citation since August and fails WP:LEAD (it is not a summary of any body content), I have deleted it. Of course if a citation can be found even at this late stage, reinstate at will. Otherwise it is just an expression of the personal experience of a few people and thus WP:OR. --John Maynard Friedman (talk) 08:29, 13 October 2021 (UTC)
- Once again, there is proof that this is the only method a lot of people know to type in characters, right here in Wikipedia. If it was not the only method some people know, there would be no demand for a table of Alt codes since nobody would need to refer to it.Spitzak (talk) 17:53, 18 March 2022 (UTC)
- And once again, where is this proof that "it is the only method that a lot of people [sic] know"? How many is "a lot"? Says who? Wikipedia is not a reliable source. The fact that an article exists just proves that there was at least one person who cared enough to write it. Is the article being read thousands of times an hour? --John Maynard Friedman (talk) 09:18, 20 March 2022 (UTC)
- The proof is the almost instantaneous complaint that the numbers are missing if any attempt is made to remove them. If people knew of an alternative preferable method to insert the characters, they would not be interested in the values and would not detect the removal of them so quickly.Spitzak (talk) 16:35, 20 March 2022 (UTC)
- No, that just proves that there are a number of editors who are watching this page, who want to be sure that it actually delivers the goods, and that your reformatting of the table doesn't (as it is beginning to seem) violate MOS:ACCESS. Per template added above, the typical number of page views per day is 250, which is just background radiation. --John Maynard Friedman (talk) 17:26, 20 March 2022 (UTC)
- Back when the alt codes were readily available in CP437 and CP1252, I frequently used them when I wanted a non-keyboard character. I don't have the AltGr key. I could type the Unicode value into Wordpad followed by Alt+X and then copy and paste the result into my target document — that was a known technique — but that would involve far more keystrokes, so alt codes were not the best known or only known way to produce off-keyboard characters. I had the Wordpad technique, but I didn't use it when alt codes produced the desired result more easily. Peter Brown (talk) 23:12, 20 March 2022 (UTC)
- If you use the US International keyboard setting, your right-hand Alt key becomes an AltGr key. Or you could install a compose key TSR. Or use TeX/LaTeX. Seriously, does anybody under the age of about 50 use this technique? --John Maynard Friedman (talk) 08:53, 21 March 2022 (UTC).
- Be nice to us folks over 50. When I have a chance, I'll look into these approaches; would each of them provide a way of producing all of the characters in CP437 and CP1252? And why are alt codes the only Windows approach recommended for dashes and quotation marks by "Dash § Typing the characters" and "Quotation marks in English § Typing quotation marks on a computer keyboard" if there are better techniques? Peter Brown (talk) 14:28, 21 March 2022 (UTC)
- It's a vendetta (I'm horrible to myself too).
- And autocorrect of course. For mdash, type --; for ndash type space-space. Autocorrect will change " to curved double-quote even if you wanted double prime. It works for all of the people most of the time, most of the people all of the time, but not all of the people all of the time. This, I would argue, is the technique best known.
- No, US International certainly won't do them all on MsWindows (more useful OSs are available). 'Compose key' will do all or nearly all Latin alphabet symbols and their diacritics. If you are studying a non-Latin foreign language (Greek, Cyrillic, Devanagari etc etc) at University, you will be expected to acquire an appropriate keyboard and input method editor (if appropriate). If you are doing serious science or mathematics, you have to learn LaTeX because no conventional word processor will do the markup you need; alt-codes are not even in the building.
- So yes, Alt-codes still have a role if you have an occasional need for an obscure symbol and can't be bothered to set up an autocorrect sequence for it. But not if you are routinely skiing off-piste. --John Maynard Friedman (talk) 17:45, 21 March 2022 (UTC)
Errors in table
The table contains many errors; in particular, the 0 is missing from a lot of CP1252 codes. I also think that all commonly available ASCII (i.e. the letters, digits, period, comma and space) can be removed. Spitzak (talk) 17:56, 18 March 2022 (UTC)
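- To illustrate why a missing leading 0 matters, here is a minimal Python sketch. It assumes a US-English Windows setup in which codes typed without a leading zero are looked up in Code page 437 and codes typed with a leading zero are looked up in Code page 1252. The function name `alt_code_to_char` is hypothetical, and the sketch ignores codes below 32 (which Alt codes map to graphic symbols rather than control characters) as well as any behaviour above 255.

```python
def alt_code_to_char(typed: str) -> str:
    """Simplified model: digits typed on the numeric keypad while Alt is held.

    Assumption (not from the article): OEM code page is CP437 and the
    Windows ("ANSI") code page is CP1252, as on a US-English system.
    """
    value = int(typed)
    if not 32 <= value <= 255:
        raise ValueError("this sketch only covers codes 32-255")
    # A leading zero selects the Windows code page; otherwise the OEM one.
    codec = "cp1252" if typed.startswith("0") else "cp437"
    # A few CP1252 bytes (e.g. 0129) are unassigned in Python's codec
    # and raise UnicodeDecodeError here.
    return bytes([value]).decode(codec)


if __name__ == "__main__":
    for code in ("169", "0169", "129", "0252"):
        print(f"Alt+{code} -> {alt_code_to_char(code)}")
```

Under these assumptions the same digits give different characters depending on the leading zero, so a table entry that drops the 0 from a CP1252 code documents a different character.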
- The tables in the articles Code page 437 and Code page 1252 used to contain the alt codes; these were far more convenient than the present table in Alt code, with the added advantage of being accurate. Early this year the numbers in Code page 437 were restored in this version by Gonnym but Spitzak deleted them a few days later. Peter Brown (talk) 23:42, 18 March 2022 (UTC)
- Once again this indicates that adding a "this is often the only way people know how to enter characters" to this article is needed, as the desire to have a table of the alt codes is proof of this fact.Spitzak (talk) 16:27, 19 March 2022 (UTC)
- Alt codes are in the row headings at the left of the 437 table, as well as being in the tooltips. As alt codes are totally useless on mobile the fact that they can't be seen on mobile should be irrelevant. They are also in the tooltips of the CP1252 table but they are not in the row headings. I think they could be put in the table if the insistence that the Unicode code points be visible on mobile was removed.Spitzak (talk) 16:30, 19 March 2022 (UTC)
- Not sure I really understand the issue. The original table had alt codes showing without a tooltip, you then replaced the table and added the tooltip. Now you are complaining about the tooltip? Also, please see MOS:NOHOVER Gonnym (talk) 19:37, 19 March 2022 (UTC)
- I have checked out tooltips on both pages using Firefox, Google Chrome, and Microsoft Edge. Although the alt codes are present in the wikitext, they only show up in tooltips in the range Alt+176 to Alt+223 in CP437. I expect that this is a problem with Template:chset-cell1, which I am far from competent to fix. Notwithstanding MOS:NOHOVER, I would be content if the alt codes actually were accessible in tooltips. Failing that, I hope that they can be provided in the table cells.
Peter Brown (talk) 01:33, 20 March 2022 (UTC)
- The tooltip is working in all character cells in CP437 for me. I suspect you are being confused by the Unicode code point number that others insisted had to be visible. This sort of confusion is why I would prefer there to be no numbers visible, instead putting them in the tooltip where more text can be used to disambiguate them.
- In any case this article is a much better place for a table, as it is a far more likely page a user will look at if they attempt to find the table.Spitzak (talk) 16:32, 20 March 2022 (UTC)
- Actually the problem is you need to point outside the character, near the edge of the cells, for the tooltip to work for cells with links. IMHO Wikipedia should be fixed so the tooltip takes precedence over link previews in the tables. Or the links could be removed.Spitzak (talk) 16:34, 20 March 2022 (UTC)
- Thanks, Spitzak! I wish I had known that a loong time ago! Yes, somebody who knows how please make the fix! I have updated the articles with the information. Peter Brown (talk) 22:32, 20 March 2022 (UTC)
convert special characters found by Wikipedia:Typo Team/moss (via WP:JWB)
@Beland:, would you explain your edit today that deleted valid content from this article, which Spitzak and I have had to spend time to reinstate? Are you running a bot that will delete it all again tomorrow? --John Maynard Friedman (talk) 22:31, 9 April 2022 (UTC)
- @John Maynard Friedman: Ah, sorry about the dropped characters. © and ® are usually violations of MOS:TMRULES which is why my script removes them by default. It's not a bot; I manually inspect each edit. There were a lot of changes in the diff I was looking at and I was distracted by not preventing other breakage, so I missed these deletions. I do have a database scan that enforces the trademark symbol guideline so this page would show up on my todo list eventually with no further action (though not for a while since there is a queue of tens of thousands of violations). Normally with character-encoding lists and whatnot I either mark the page for skipping or tag the characters that should not be changed (and I'm usually better about spotting them). In this case, I see one instance has already been tagged with {{char}}, which will prevent the script from complaining about it. I tagged the others with {{not a typo}} which should prevent this page from showing up in the queue again. Sorry again for the mistake, and thanks for the keen eyes and quick fixes! -- Beland (talk) 22:59, 9 April 2022 (UTC)