Wikipedia:Reference desk/Archives/Computing/2021 June 10

From Wikipedia, the free encyclopedia
Computing desk
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


June 10

How to copy all file & folder names to a notepad txt file?

I have 10 folders with 5 files in each. How can I copy all the file and folder names to a Notepad txt file? This video only explains how to do it for file names; I want the folder names as well. Rizosome (talk) 13:33, 10 June 2021 (UTC)[reply]

Rizosome, are you in Windows? Windows 10? A little help please. Elizium23 (talk) 13:36, 10 June 2021 (UTC)[reply]

@Elizium23: Windows 10. Rizosome (talk) 13:52, 10 June 2021 (UTC)[reply]

From a cmd prompt, dir /s /b redirected to a text file might be what you want. Mitch Ames (talk) 13:55, 10 June 2021 (UTC)[reply]
Open PowerShell and use the ls command. For example, to list everything in the folders under "Documents", run "ls -r -Name .\Documents\". You can then copy and paste the result into Notepad. El sjaako (talk) 13:56, 10 June 2021 (UTC)[reply]
dir /b /s >MyInfo.txt should do it.--Phil Holmes (talk) 13:58, 10 June 2021 (UTC)[reply]
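
For comparison, the kind of listing that dir /s /b produces can be sketched in Python. This is only an illustration and was not suggested in the thread: the function names list_all_paths and write_listing and the output file name are made up here, and the ordering will not necessarily match what dir prints.

```python
from pathlib import Path


def list_all_paths(root):
    """Return the full path of every folder and file under root (recursive)."""
    root = Path(root).resolve()
    # rglob("*") walks the whole tree and yields folders as well as files.
    return sorted(str(p) for p in root.rglob("*"))


def write_listing(root, out_file):
    """Write one path per line to a UTF-8 text file that Notepad can open."""
    paths = list_all_paths(root)  # collected before writing the output file
    Path(out_file).write_text("\n".join(paths) + "\n", encoding="utf-8")
    return paths
```

A call such as write_listing(r".\Documents", "MyInfo.txt") (both names hypothetical) would then produce the text file. Like the dir command, this lists each folder as its own entry in addition to the files inside it.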

@Mitch Ames: It worked, but I don't want the folders repeated. I have 1 folder with 3 files in it. Your command gives me 4 results: 1 folder + 3 files.

  • Wikipedia
  • Wikipedia\Spam.mp4
  • Wikipedia\Spam 2.mp3
  • Wikipedia\Spam 3.txt

I want it to show only 3 results, one for each of the 3 files in that folder.

I'm pretty sure there's no getting around that using the dir command. I use it all the time and just open the file in Notepad to clean it up manually. Maybe PowerShell can do it; my experience with it is limited. Matt Deres (talk) 14:18, 10 June 2021 (UTC)[reply]
The find command filters out the bare directories; actually, it filters out the entries without a period in them, which is not the same thing. It won't work correctly if the folder has a period in it or the file has no extension (both of which are valid scenarios), e.g. "my.folder" or "thefolder\myfile". Mitch Ames (talk) 00:03, 11 June 2021 (UTC)[reply]
True, but nobody would be so evil that they added a period to a directory name or dropped the suffix from a filename on a Windows system. However, using the /a-d option is, indeed, much more correct — GhostInTheMachine talk to me 09:10, 11 June 2021 (UTC)[reply]
I always knew that the .NET Framework was evil (C:\Windows\Microsoft.NET\ etc), but Git (something\.git\) is not so bad. Mitch Ames (talk) 09:29, 11 June 2021 (UTC)[reply]
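
The difference between the two filters discussed above can be sketched in Python. This is a hedged illustration, not code from the thread: list_files_only asks the file system whether each entry is a file (the same idea as the /a-d option), while list_with_period mimics the "contains a period" text filter. Both function names are invented here, and the period test is applied to the path relative to the starting folder, which simplifies what a text filter on full paths would do.

```python
from pathlib import Path


def list_files_only(root):
    """Return paths of files only, decided by the file system, not by the name."""
    root = Path(root).resolve()
    return sorted(str(p) for p in root.rglob("*") if p.is_file())


def list_with_period(root):
    """Naive text filter, shown for contrast: keep entries containing a period."""
    root = Path(root).resolve()
    return sorted(
        str(p) for p in root.rglob("*")
        if "." in str(p.relative_to(root))  # simplification: relative path only
    )
```

With an empty folder named "my.folder" and a file "thefolder\myfile", the first function returns just the file, whereas the second returns just the folder, which are exactly the two failure cases described above.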

@Matt Deres: Thanks, what's the proper term for these "repeating folders"? Rizosome (talk) 14:21, 10 June 2021 (UTC)[reply]

@Rizosome: I doubt it has a special term; what you've got is a list of pathways, some of which end with a filename and some of which end with a folder name. When I use dir it's to create file links in Excel, so - like you - I also do not want the pathways ending with folders, but I do want the folders to be included in the file pathways (as in your example). If you have hundreds or thousands of files to sort through, you could try importing the txt file into Excel and using Excel's "text to columns" and other functions to clean up the list, but you'll need to be very careful about how you instruct Excel to delimit the data (also, you'll want to open the file in Notepad first to remove the header and footer info). If you really only have 50 files, it'll be far faster and more reliable to just do all the work in Notepad. Matt Deres (talk) 14:36, 10 June 2021 (UTC)[reply]

@Elizium23: I want folder names, but CMD repeats each folder in the output if it contains files. Rizosome (talk) 14:23, 10 June 2021 (UTC)[reply]

I don't think the DOS DIR command is all that smart. But if you enter "dir /?" you can see all possible options. ←Baseball Bugs What's up, Doc? carrots 14:26, 10 June 2021 (UTC)[reply]
If you were willing to install Cygwin, you could do it with find . -type f. CodeTalker (talk) 15:50, 10 June 2021 (UTC)[reply]

Proper handling of UTF-8 byte order mark

I'm a bit confused about how the byte order mark (BOM) actually works in UTF-8. Given two identical texts differing only by the fact that one begins with a BOM, are they otherwise equivalent or will the subsequent octets of each codepoint be in reverse order? Earl of Arundel (talk) 14:58, 10 June 2021 (UTC)[reply]

The BOM's main purpose is to show the byte order in UTF-16 files. As our byte order mark article says, "Byte order has no meaning in UTF-8". So the presence or absence of the BOM has no effect on the remainder of the contents of a UTF-8 file. CodeTalker (talk) 15:54, 10 June 2021 (UTC)[reply]
Got it! Thanks for the confirmation. Earl of Arundel (talk) 20:25, 10 June 2021 (UTC)[reply]
But, as the article states, "Its presence interferes with the use of UTF-8 by software that does not expect non-ASCII bytes at the start of a file". That would likely be the case for legacy software. In writing UTF-8 compliant code, a three-byte BOM in the input should simply be ignored, and preferably none should be generated in the output unless there is a specifically compelling reason.  --Lambiam 21:01, 10 June 2021 (UTC)[reply]
Right, may as well just strip it out. It's typically not used anyway, so no real need to support that corner case. Earl of Arundel (talk) 01:54, 11 June 2021 (UTC)[reply]
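
The two points made above (the BOM does not change the bytes that follow it, and a reader can simply skip it) can be checked with a short Python sketch. The helper names strip_utf8_bom and decode_utf8_ignoring_bom and the sample text are illustrative assumptions; codecs.BOM_UTF8 and the "utf-8-sig" codec are part of the Python standard library.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM if present; leave everything else untouched."""
    return data[len(BOM):] if data.startswith(BOM) else data


def decode_utf8_ignoring_bom(data: bytes) -> str:
    """Decode UTF-8; the "utf-8-sig" codec skips one leading BOM if there is one."""
    return data.decode("utf-8-sig")


sample = "naïve café"            # arbitrary example text with non-ASCII characters
plain = sample.encode("utf-8")   # no BOM is written by the plain "utf-8" codec
with_bom = BOM + plain

# The bytes after the BOM are identical to the BOM-less encoding; nothing is reversed.
print(with_bom[len(BOM):] == plain)                                          # True
print(decode_utf8_ignoring_bom(with_bom) == decode_utf8_ignoring_bom(plain))  # True
```

Decoding with_bom using the plain "utf-8" codec instead leaves U+FEFF as the first character of the text, which is the kind of stray non-ASCII data that trips up software not expecting it.
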
For what it's worth, I recently learned that its presence can also interfere with the Terminal program here on this Mac, in that if I should ever try to display a text file containing a UTF-8 BOM immediately followed by a Tab, the Terminal program crashes. (Any assistance reporting this bug to Apple greatly appreciated. :-) ) —scs (talk) 00:13, 11 June 2021 (UTC)[reply]
That'd be an error on the decoder's part. Malformed input should not crash a program... Earl of Arundel (talk) 01:54, 11 June 2021 (UTC)[reply]
Oh, unquestionably! I was reasonably stunned when it happened, and again when I eventually determined what the cause was. As a previous poster noted, a BOM is perfectly meaningless in UTF-8, and it's hard for me to imagine any mainstream code ever putting one there. (Although, yes, a simpleminded conversion from UTF-16 might be likely to.) —scs (talk) 14:42, 11 June 2021 (UTC)[reply]
I can confirm that this causes Terminal to quit unexpectedly (macOS 10.14.6, Terminal version 2.9.5).  --Lambiam 08:58, 11 June 2021 (UTC)[reply]
Thanks for confirming, although that was quite unnecessary! I hope you didn't lose any work. :-) (I didn't, either, although it was a near thing.)
Who knew that "echo 77u/CQ== | base64 -D" could be a kiss-of-death? (Don't try this at home, kids.) —scs (talk) 14:42, 11 June 2021 (UTC)[reply]
It was not clear to me from your first comment that you found this out the hard way, rather than reading it on some blog. Apparently some program you used put that useless BOMb there. I only see the issue when the whole output consists of four 8-bit bytes, so that tab is the last byte.  --Lambiam 13:51, 12 June 2021 (UTC)[reply]
@Lambiam: My first comment was just an offhand interjection into the discussion to second the notion that, yes, BOMs in UTF-8 are goofy. I probably shouldn't have posted it at all. I wasn't looking for an explanation. I know exactly which company's program did it, but I decided not to mention it, because I bash that company way too much already. —scs (talk) 03:45, 13 June 2021 (UTC)[reply]
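
For readers curious what the base64 string quoted above contains, it can be inspected without sending the raw bytes to a terminal. The following Python sketch decodes it and prints the bytes as hexadecimal text only; the function name describe_payload is made up for this illustration.

```python
import base64
import codecs


def describe_payload(b64_text: str) -> str:
    """Decode base64 and return the bytes as space-separated hex (safe to print)."""
    raw = base64.b64decode(b64_text)
    return " ".join(f"{byte:02x}" for byte in raw)


raw = base64.b64decode("77u/CQ==")
print(describe_payload("77u/CQ=="))         # ef bb bf 09
print(raw.startswith(codecs.BOM_UTF8))      # True
print(raw[len(codecs.BOM_UTF8):] == b"\t")  # True
```

That is four bytes in total: the three-byte UTF-8 BOM followed by a single tab, which matches the observation that the problem appears when the tab is the last byte of the output.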