66% speedup for Path.Combine on Windows#11293
Conversation
|
Seems reasonable to me. Thanks! It would also be interesting to understand whether the implementation in coreclr is actually providing an improvement for typical use cases; maybe there's a threshold that should be applied, e.g. the current implementation is used when the length of the string and the length of the char array are greater than some determined thresholds, otherwise a simple double-loop approach is employed. cc: @jkotas |
This loop will be always faster. While you are on it, you may also get rid of |
I agree in this case, especially since most of the elements can be grouped into a single comparison. I was referring more generally to other places where IndexOfAny is used, without a priori knowledge of such groupings. Presumably there is some cross-over point where all of that up-front work done by the current implementation provides value... if yes, and if that value isn't 0, seems like we should put another implementation in place for the cases that suffer under that implementation's weight... and if no, then it seems like we should delete that "optimization" from coreclr entirely. |
|
@stephentoub I wrote an API proposal that did some study of this after discovering the bottleneck yesterday. To my understanding, unless a character is found at maybe the first 4 indices in the string, the current implementation will always be faster. The reason is that the "naive" approach makes a pass over the array for each char in the string, so that's O(m * n). With the bitmap approach, it takes 1 pass over the array to initialize the bitmap, and then checking if a character is contained in the bitmap is just a few bit operations which are O(1), much cheaper than making another pass. So the complexity there is closer to O(m + n). |
What if, for example, the char[] only has two chars in it? |
|
@stephentoub I just did some benchmarking over here with arrays/strings of varying lengths. It looks like the cutoff may have been higher than I expected for smaller arrays; for length-2 arrays it seems to be at around index 9, for length 3/4 it's about index 7. I have just opened dotnet/coreclr#7017 to track if we should apply some kind of threshold before which we just use a naive loop. |
I'll fix this after this is merged. |
|
Thanks! I'll look into porting this to desktop. |
66% speedup for Path.Combine on Windows Commit migrated from dotnet/corefx@980aa1b
I ran some iterations of
Path.Combinethrough PerfView recently to see if there was anything that could be optimized. It turns out that this check, which callsIndexOfAnywith a gigantic char array, was taking up as much of 80% of CPU time; in fact this method alone, which initializes a "probabilistic map" to represent if a char is in the array, was where the program spent 60% of its time.Changing the method to use a manual, "naive" implementation of
IndexOfAnyresults in an approximately 3x speedup forPath.Combine, from ~2.3s to 0.7s. You can see the test app and results I made for benchmarking here.Other note: I also removed an apparently dead optional parameter that's not being used anywhere.
cc @JeremyKuhne, @stephentoub