Some files won't download (Unicode support issue?)

doomkrad's picture

Hi Nir. : )

For the second time, I've run into a situation where characters in a file name keep me from downloading the file. Here are the details for both cases:

- user cyanosis
- folder 'photek - dj kicks...'
- track 1.03, title 'Hot Toddy ?eat Ron Basejam...'
I assume the character - shown by my client as '?' - is the culprit.

- user Backintosh
- folder contains 'Applied Rhythmic Technology'
- track 09, title contains the string '|sata|' ([vertical bar]"sata"[vertical bar]).

Many thanks in advance.

I think this is a problem with the old Soulseek client, where certain international characters cause filenames to be indexed improperly. These problems should happen less and less as more people switch to SoulseekQt.

There are two problems, actually.

As Nir said, in SoulseekNS, non-ANSI characters in filenames are translated to "?" in the sharer's index. (The definition of ANSI characters depends on the localized version of Windows.) When you request one of these files, you're asking for it with the "?", so it'll match what's in the index, but their client will not be able to find a file with that name on disk, because it doesn't exist with a "?" in its name... and even if it did, Windows would forbid reading it because the Windows APIs forbid "?" in filenames. The solution is for the sender to use a better client; there's nothing you can do on your end.
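To illustrate the kind of lossy conversion described above, here is a minimal Python sketch. It is not SoulseekNS's actual code; the function name `to_ansi_index_name`, the sample filename, and the choice of cp1252 as the "ANSI" code page are all assumptions made for the example.

```python
def to_ansi_index_name(name: str, codepage: str = "cp1252") -> str:
    """Round-trip a filename through a legacy code page.

    Any character the code page cannot represent comes back as '?',
    which is the kind of name that would end up in the sharer's index.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


# Hypothetical filename: 'é' exists in cp1252, the three katakana characters do not.
original = "Caf\u00e9 \u30c6\u30b9\u30c8.mp3"
indexed = to_ansi_index_name(original)

print(indexed)              # Café ???.mp3
print(indexed == original)  # False: the indexed name no longer matches the file on disk
```

The conversion is one-way: once the characters have become "?", nothing in the indexed name tells you what they originally were, which is why the requester cannot repair it.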

"|" is like "?" in that Windows APIs, mainly for backward compatibility, forbid reading and writing files or folders with that character anywhere in the path. So this is not an encoding or indexing issue; it's just a limitation of Windows. The sharer, running a non-Windows OS, can create filenames with that character in them, and you can request those files...but you, running Windows, are not able to create a filename with that character, so the download fails. If you were running another OS, you could download the file (even to an NTFS-formatted disk), but you couldn't access it if you attached the disk to a Windows box.

I have previously suggested that SoulseekQt make substitutions of the disallowed characters on Windows (it's a short list), but Nir has not implemented this feature.
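For what it's worth, the substitution I have in mind could look roughly like the Python sketch below. This is only an illustration of the idea, not anything SoulseekQt does; the function name `sanitize_component` and the choice of "_" as the replacement are my own assumptions.

```python
import re

# Characters the Windows file APIs reject in a file or folder name:
# < > : " / \ | ? * plus the ASCII control characters 0-31.
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_component(name: str, replacement: str = "_") -> str:
    """Make one path component (a file or folder name, not a full path) safe on Windows."""
    cleaned = _WINDOWS_FORBIDDEN.sub(replacement, name)
    # Windows does not allow names that end in a space or a period either.
    return cleaned.rstrip(" .")


print(sanitize_component("09 - |sata|.mp3"))  # 09 - _sata_.mp3
print(sanitize_component("Hot Toddy ?eat"))   # Hot Toddy _eat
```

The client would still request the file from the sharer under its original name and only use the sanitized name when writing to the local disk. Note that it has to be applied to each folder and file name separately, since "/" and "\" are replaced too.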

doomkrad's picture

this has clarified the issue for me.

Just for the record:
if the aforementioned feature isn't implemented, the 'receiving users' will be left at the mercy of the 'sharing users'. I for one had to give up hope on those files in both cases: I contacted the users and asked for assistance, but - for whatever reason - got no reply from them. :-(

Thanks again.

doomkrad's picture

[SoulSeekQt 2013.7.10, WinXP SP3 (ENGLISH) ]

Hi Nir!
I just discovered a feature called 'Code Page to use to translate non-Unicode characters'.

1. Is this the place where I can 'tweak' file names, so that my WinBlows system will agree to download them?
and
2. if so, WHICH Code Page should I pick from the dropdown list?! I have no clue...

Many thanks in advance Nir.

Kind regards,
Valentin

p.s. I'm reposting this here because I fear I posted it at the wrong spot initially. Thank you.

For most non-English (diacritic) characters in Western, Central, and Northern European languages, use the windows-1252 (CP1252) code page. If you mostly deal with Cyrillic, use windows-1251 (CP1251).