Skip to content

Comments

WIP: Demo using Simple Case Folding#8515

Closed
iSazonov wants to merge 45 commits intoPowerShell:masterfrom
iSazonov:add-unicode4-2level-cache
Closed

WIP: Demo using Simple Case Folding#8515
iSazonov wants to merge 45 commits intoPowerShell:masterfrom
iSazonov:add-unicode4-2level-cache

Conversation

@iSazonov
Copy link
Collaborator

@iSazonov iSazonov commented Dec 21, 2018

PR Summary

Related #8120.

The PR is only demo how we could use Simple Case Folding to speed up PowerShell engine.

Current code uses 2-level translation table to fold Unicode strings. The table is compact: full table is 128 Kb, the compressed table is ~10Kb.

New string compare works in 3-4 faster than standard ignore case comparer.
With using the new comparer in only one place you can also see a marked improvement in startup time.
Perhaps we could use the comparer in other places.

I push alternative code too so you could experiment. Tools and temporary benchmarks is also pushed.

PR Checklist

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place of new code injection.

@iSazonov iSazonov force-pushed the add-unicode4-2level-cache branch 4 times, most recently from 0727391 to c09b5cd Compare December 26, 2018 08:48
@iSazonov iSazonov force-pushed the add-unicode4-2level-cache branch from a98fe70 to aff75e4 Compare December 28, 2018 14:54
Use 32-bit sparsed array with BinarySearch

Create init tests

Got fast SpanFold and improve perf tests

Improve CompareFolded

Move files to new folders by role

Enable perf test

Add simple folded string comparer

Use IgnoresAccessChecksToAttribute

Configure visibility internal API

Add comments. Fix for surrogates

Update names and comments

Add SimpleFoldedStringComparer constructor

Use SimpleFoldedStringComparer in PSMemberInfoInternalCollection
Add AssertExtensions from CoreFX

Use namespace System.Management.Automation.Unicode.Tests

Add namespace in CharTests

Remove duplicate tests from CharTests
Remove unneeded tests for comparer
Remove unneeded IndexOfFolded() tests
@iSazonov iSazonov force-pushed the add-unicode4-2level-cache branch from 9939013 to a03ec72 Compare December 29, 2018 03:33
@iSazonov iSazonov force-pushed the add-unicode4-2level-cache branch from d1da272 to 311ccd7 Compare December 29, 2018 12:15
/// <returns>
/// Returns folded string.
/// </returns>
//[MethodImpl(MethodImplOptions.AggressiveInlining)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of commented code and attributes in this file. Did you write this all yourself?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with zero experience in Unicode internals and c# in-depth optimizations. For the last two months I have been moving in small steps so that the code contains a lot of commented code which reflects my attempts to find fastest code. I'll remove its after the code will become stable.
All this code is written by me except for gen.cs that come from @tarekgh's gist.

internal static partial class SimpleCaseFolding
{
private static readonly ushort[] L1 =
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments about what L1 and L2 represent?

Copy link
Collaborator Author

@iSazonov iSazonov Jan 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea is to use an array for mapping (simple case folding) a source char to target char. A problem is that SimpleCase.txt contains very small account (1.5Kb) of Unicode code points from Plane0 (SMP) and Plane1 - it is 128 Kb that's a lot for the mapping array. In related Dotnet CoreFXlab issue dotnet/corefxlab#2610 @tarekgh suggested to use the 3-level (8-4-4) mapping technique (and published a gen.cs code in gist). It is ~6Kb. This turned out to be much slower than a single-level array and I tried a two-level array. It is ~10Kb.
Yesterday I found that I can use 1-level mapping for chars < 0x5ff and 2-level mapping for other chars. It is ~10Kb too and fastest case I have found. I'll push the commit today.
I guess the current code still has a lot of bugs and I consider it as "alfa". Following I plan to fix surrogates.

@TravisEz13
Copy link
Member

What is the startup time difference you see?

@iSazonov
Copy link
Collaborator Author

iSazonov commented Jan 4, 2019

@TravisEz13 My first measurement with a single-level array showed a speedup of ~8 percent. That's why I continued working on it. Of cause this result is not reliable although the standard powershell (Pester) tests pass successfully (only my xUnit tests fail currently).
I also believe that there are other places where we could inject this code to get a performance win and I need a help to find its.

@TravisEz13
Copy link
Member

Talking to @daxian-dbw his assumption is the actual folding code would go into DotNet.

@tarekgh
Copy link

tarekgh commented Jan 4, 2019

@TravisEz13

Talking to @daxian-dbw his assumption is the actual folding code would go into DotNet.

There is no plan or decision that this will happen. it depends on the case and DotNet team will approve it. My thoughts so far is to create this in a separate NuGet package and if we see a demand on this functionality we can consider moving it to the core.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Jan 4, 2019

I agree that this should be in CoreFX. At best, it will be CoreFX 3.1 and that means that PowerShell will get this advantage only a year and a half or two years from today. This is too long to wait. So I think it is best to work here and in CoreFX at the same time.
My last perf test shows (sorry I still do not push the commit) that new comparer is faster than CoreFX OrdinalIgnoreCase:

                Method |         StrA |         StrB |     Mean |     Error |    StdDev | Ratio |
---------------------- |------------- |------------- |---------:|----------:|----------:|------:|
         CoreFXCompare | CaseFolding1 | cASEfOLDING2 | 41.77 ns | 0.1741 ns | 0.1454 ns |  1.00 |
 SimpleCaseFoldCompare | CaseFolding1 | cASEfOLDING2 | 37.34 ns | 0.2286 ns | 0.1909 ns |  0.89 |
                       |              |              |          |           |           |       |
         CoreFXCompare | ЯяЯяЯяЯяЯяЯ1 | ЯяЯяЯяЯяЯяЯ2 | 87.32 ns | 0.4460 ns | 0.4172 ns |  1.00 |
 SimpleCaseFoldCompare | ЯяЯяЯяЯяЯяЯ1 | ЯяЯяЯяЯяЯяЯ2 | 37.68 ns | 0.5356 ns | 0.4748 ns |  0.43 |

I hope this works correctly.
In this case, CoreFX team may be more interested in speeding up this work.

@TravisEz13
Copy link
Member

What locale are those stats for?

@iSazonov
Copy link
Collaborator Author

iSazonov commented Jan 5, 2019

@TravisEz13 In the test I used Russian chars. Really the result is true for chars with codepoints < 0x5ff (1-level mapping is used) that is most of Latin and European codepoints. For codepoint above a 2-level mapping is used and result is ~68 ms (vs 37 ms) that is still faster than CoreFX OrdinalIgnoreCase.

@stale
Copy link

stale bot commented Feb 4, 2019

This PR has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs within 10 days.
Thank you for your contributions.
Community members are welcome to grab these works.

@stale stale bot added the Stale label Feb 4, 2019
@iSazonov
Copy link
Collaborator Author

iSazonov commented Feb 5, 2019

Now we have a PR in corefxlab repo and I hope we get the experimental package soon.

@stale
Copy link

stale bot commented Feb 15, 2019

This PR has been automatically closed because it is stale. If you wish to continue working on the PR, please first update the PR, then reopen it.
Thanks again for your contribution.
Community members are welcome to grab these works.

@stale stale bot closed this Feb 15, 2019
@iSazonov iSazonov deleted the add-unicode4-2level-cache branch October 16, 2021 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants