Special-case T==char in string.Concat(IEnumerable<T>) #14298
This allows string.Concat to be used as an efficient mechanism for creating a `string` from an `IEnumerable<char>`. The JIT specializes the implementation for char vs non-char, so there's minimal impact on the non-char case, and for the char case, we a) avoid creating a string for each individual char, and b) use StringBuilder's fast path for adding individual chars. This can result in a massive allocation savings for long enumerations (for <= 1 char, there's no difference, but each character after that is an allocation saved), and for more than a few characters can yield up to a 2x increase in throughput.
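The description above can be illustrated with a minimal sketch. This is not the corefx implementation: `ConcatSketch` and its `Concat` method are hypothetical names, and the non-char path is simplified. It only shows the shape of the idea, where a `typeof(T) == typeof(char)` check lets the char path append individual chars to a `StringBuilder` instead of creating a string per element.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;

// Hypothetical illustration only; not the actual string.Concat source.
public static class ConcatSketch
{
    public static string Concat<T>(IEnumerable<T> values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));

        // The JIT can specialize on this check per instantiation of T.
        if (typeof(T) == typeof(char))
        {
            // Safe only because the check above established that T is char.
            using (IEnumerator<char> en = Unsafe.As<IEnumerable<char>>(values).GetEnumerator())
            {
                if (!en.MoveNext()) return string.Empty;

                char first = en.Current;
                if (!en.MoveNext()) return first.ToString();

                // Two or more chars: append each char directly,
                // without allocating a string per element.
                var sb = new StringBuilder();
                sb.Append(first);
                do
                {
                    sb.Append(en.Current);
                }
                while (en.MoveNext());
                return sb.ToString();
            }
        }

        // Simplified general path: one ToString() call per element.
        var result = new StringBuilder();
        foreach (T value in values)
        {
            if (value != null) result.Append(value.ToString());
        }
        return result.ToString();
    }
}
```

With this sketch, `ConcatSketch.Concat(new[] { 'a', 'b', 'c' })` takes the char path, while `ConcatSketch.Concat(new[] { 1, 2, 3 })` takes the general path.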
// Special-case T==char, as we can handle that case much more efficiently,
// and string.Concat(IEnumerable<char>) can be used as an efficient
// enumerable-based equivalent of new string(char[]).
using (IEnumerator<char> en = Unsafe.As<IEnumerable<char>>(values).GetEnumerator())
@AndyAyersMS, I previously had this as:
IEnumerable<char> charsValues = (IEnumerable<char>)values;
but that was causing the JIT to emit CORINFO_HELP_CHKCASTANY, which added measurable overhead for small enumerables; switching to Unsafe.As fixed the regression. Is it expected that the JIT wasn't able to remove what was effectively an `IEnumerable<char>` to `IEnumerable<char>` cast?
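For readers comparing the two forms discussed here, a small side-by-side sketch follows. `CastSketch`, `ViaCast`, and `ViaUnsafeAs` are hypothetical names introduced for illustration. Both methods return the same reference; the difference is that the checked cast asks the runtime to verify the type, while `Unsafe.As` performs no check and relies on the preceding `typeof` guard.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical helpers for comparison only.
public static class CastSketch
{
    // Checked cast: the runtime verifies the object's type. In the discussion
    // above, this form resulted in a CORINFO_HELP_CHKCASTANY helper call.
    public static IEnumerable<char> ViaCast<T>(IEnumerable<T> values)
    {
        if (typeof(T) != typeof(char)) throw new InvalidOperationException("T must be char");
        return (IEnumerable<char>)values;
    }

    // Unchecked reinterpretation: no runtime type check is performed, so this
    // is only safe because the typeof guard already established that T is char.
    public static IEnumerable<char> ViaUnsafeAs<T>(IEnumerable<T> values)
    {
        if (typeof(T) != typeof(char)) throw new InvalidOperationException("T must be char");
        return Unsafe.As<IEnumerable<char>>(values);
    }
}
```

Dropping the guard from `ViaUnsafeAs` would remove the only thing that makes it type-safe, which is why the pattern belongs inside a `typeof(T) == typeof(char)` branch.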
There is code in the jit's assertion prop for cast optimization. Seems like it ought to be firing in a case like this. I'll take a closer look.
My intention is to also add support for optimizing casts into the jit's front-end (similar to what I have been doing recently for type equality checks). The earlier we can prune away code in these type-specializing kinds of methods, the better.
Thanks, @AndyAyersMS. Here's a standalone repro:
using System;
using System.Collections.Generic;

class Program
{
    static void Main() => MyMethod(new char[0]);

    static void MyMethod<T>(IEnumerable<T> chars)
    {
        if (typeof(T) == typeof(char))
        {
            foreach (char c in (IEnumerable<char>)chars) { }
        }
    }
}
The beginning of the Jit_Disasm for MyMethod is:
G_M9149_IG01:
55 push rbp
4883EC30 sub rsp, 48
488D6C2430 lea rbp, [rsp+30H]
488965F0 mov qword ptr [rbp-10H], rsp
G_M9149_IG02:
488BD1 mov rdx, rcx
48B938CED75BF97F0000 mov rcx, 0x7FF95BD7CE38
E890EF395F call CORINFO_HELP_CHKCASTANY
488BC8 mov rcx, rax
49BB20009909F97F0000 mov r11, 0x7FF909990020
3909 cmp dword ptr [rcx], ecx
41FF13 call gword ptr [r11]System.Collections.Generic.IEnumerable`1[Char][System.Char]:GetEnumerator():ref:this
Assertion prop doesn't fire because we don't seed the assertion table with known initial facts. In particular we arguably should assert that the types of arguments and locals are at least their declared types.
However, since the assertion table space is a scarce resource, we probably can't afford to start adding these initially known facts by default. In particular there can be a lot of locals, and at times the information is not useful -- for instance knowing that a ref type is at least `object` doesn't add any value.
Opened #14308 to record these shortcomings of assertion prop.
Thanks.
@dotnet-bot test OSX10.12 x64 Checked Build and Test please ("Java / Jenkins hit a remoting error that caused the build to fail.")
Improvement looks good! Thanks, Steve.
Thank you!
Closes https://github.com/dotnet/corefx/issues/24395
cc: @joperezr, @AlexGhiondea, @jkotas